Synthetic DNA sequences having enhanced expression in monocotyledonous plants and method for preparation thereof

ABSTRACT

A method for modifying a foreign nucleotide sequence for enhanced accumulation of its protein product in a monocotyledonous plant and/or increasing the frequency of obtaining transgenic monocotyledonous plants which accumulate useful amounts of a transgenic protein by reducing the frequency of the rare and semi-rare monocotyledonous codons in the foreign gene and replacing them with more preferred monocotyledonous codons is disclosed. In addition, a method is provided for enhancing the accumulation of a polypeptide encoded by a nucleotide sequence in a monocotyledonous plant and/or increasing the frequency of obtaining transgenic monocotyledonous plants which accumulate useful amounts of a transgenic protein by analyzing the coding sequence in successive six nucleotide fragments and altering the sequence based on the frequency of appearance of the six-mers as compared to the frequency of appearance of the rarest 284, 484, and 664 six-mers in monocotyledonous plants. Also disclosed are novel structural genes which encode insecticidal proteins of B.t.k. and monocotyledonous (e.g. maize) plants containing such novel structural genes.

This is a divisional of application Ser. No. 08/530,492 filed Sep. 19, 1995, now U.S. Pat. No. 5,689,052, which is a continuation of patent application Ser. No. 08/172,333 filed Dec. 22, 1993, now abandoned.

FIELD OF THE INVENTION

This invention generally relates to genetic engineering and more particularly to methods for enhancing the expression of a DNA sequence in a monocotyledonous plant and/or increasing the frequency of obtaining transgenic monocotyledonous plants which accumulate useful amounts of a transgenic protein.

BACKGROUND OF THE INVENTION

One of the primary goals of plant genetic research is to provide transgenic plants which express a foreign gene in an amount sufficient to confer the desired phenotype to the plant. Significant advances have been made in pursuit of this goal, but the expression of some foreign genes in transgenic plants remains problematic. It is believed that numerous factors are involved in determining the ultimate level of expression of a foreign gene in a plant, and the level of mRNA produced in the plant cells is believed to be a major factor that limits the amount of a foreign protein that is expressed in a plant.

It has been suggested that the low levels of expression observed for some foreign proteins expressed in monocotyledonous plants (monocots) may be due to low steady state levels of mRNA in the plant as a result of the nature of the coding sequence of the structural gene. This could be the result of a low frequency of full-length RNA synthesis caused by the premature termination of RNA during transcription or due to unexpected mRNA processing during transcription. Alternatively, full-length RNA could be produced, but then processed by splicing or polyA addition in the nucleus in a fashion that creates a nonfunctional mRNA. It is also possible for the mRNA to be properly synthesized in the nucleus, yet not be suitable for sufficient or efficient translation in the plant cytoplasm.

Various nucleotide sequences affect the expression levels of a foreign DNA sequence introduced into a plant. These include the promoter sequence, intron sequences, the structural coding sequence that encodes the desired foreign protein, 3′ untranslated sequences, and polyadenylation sites. Because the structural coding region introduced into the plant is often the only “non-plant” or “non-plant related” sequence introduced, it has been suggested that it could be a significant factor affecting the level of expression of the protein. In this regard, investigators have determined that typical plant structural coding sequences preferentially utilize certain codons to encode certain amino acids in a different frequency than the frequency of usage appearing in bacterial or non-plant coding sequences. Thus it has been suggested that the difference between the typical codon usage present in plant coding sequences and the typical codon usage present in the foreign coding sequence is a factor contributing to the low levels of the foreign mRNA and foreign protein produced in transgenic monocot plants. These differences could contribute to the low levels of mRNA or protein of the foreign coding sequence in a transgenic plant by affecting the transcription or translation of the coding sequence or proper mRNA processing. Recently, attempts have been made to alter the structural coding sequence of a desired polypeptide or protein in an effort to enhance its expression in the plant. In particular, investigators have altered the codon usage of foreign coding sequences in an attempt to enhance their expression in a plant. Most notably, the sequence encoding insecticidal crystal proteins of B. thuringiensis (B.t.) has been modified in various ways to enhance its expression in a plant, particularly monocotyledonous plants, to produce commercially viable insect-tolerant plants.

In the European Patent Application No. 0359472 of Adang et al., a synthetic B.t. toxin gene was suggested which utilized codons preferred in highly expressed monocotyledonous or dicotyledonous proteins. In the Adang et al. gene design, the resulting synthetic gene closely resembles a typical plant gene. That is, the native codon usage in the B.t. toxin gene was altered such that the frequency of usage of the individual codons was made to be nearly identical to the frequency of usage of the respective codons in typical plant genes. Thus, the codon usage in a synthetic gene prepared by the Adang et al. design closely resembles the distribution frequency of codon usage found in highly expressed plant genes.

Another approach to altering the codon usage of a B.t. toxin gene to enhance its expression in plants was described in Fischhoff et al., European Patent Application No. 0385962. In Fischhoff et al., a synthetic plant gene was prepared by modifying the coding sequence to remove all ATTTA sequences and certain identified putative polyadenylation signals. Moreover, the gene sequence was preferably scanned to identify regions with greater than four consecutive adenine or thymine nucleotides, and if more than one of the minor polyadenylation signals was identified within ten nucleotides of each other, then the nucleotide sequence of this region was altered to remove these signals while maintaining the original encoded amino acid sequence. The overall G+C content was also adjusted to provide a final sequence having a G+C ratio of about 50%.

PCT Publication No. WO 91/16432 of Cornelissen et al. discloses a method of modifying a DNA sequence encoding a B.t. crystal protein toxin wherein the gene was modified by reducing the A+T content by changing the adenine and thymine bases to cytosine and guanine while maintaining a coding sequence for the original protein toxin. The modified gene was expressed in tobacco and potato. No data was provided for maize or any other monocot.

SUMMARY OF THE INVENTION

Briefly, a method for modifying a nucleotide sequence for enhanced accumulation of its protein or polypeptide product in a monocotyledonous plant is provided. Surprisingly, it has been found that by reducing the frequency of usage of rare and semi-rare monocotyledonous codons in a foreign gene to be introduced into a monocotyledonous plant by substituting the rare and semi-rare codons with more preferred monocotyledonous codons, the accumulation of the protein in the monocot plant expressing the foreign gene and/or the frequency of obtaining a transformed monocotyledonous plant which accumulates the insecticidal B.t. crystal protein at levels greater than 0.005 wt % of total soluble protein is significantly improved. Thus, the present invention is drawn to a method for modifying a structural coding sequence encoding a polypeptide to enhance accumulation of the polypeptide in a monocotyledonous plant which comprises determining the amino acid sequence of the polypeptide encoded by the structural coding sequence and reducing the frequency of rare and semi-rare monocotyledonous codons in a coding sequence by substituting the rare and semi-rare monocotyledonous codons in the coding sequence with a more-preferred monocotyledonous codon which codes for the same amino acid.

The present invention is further directed to synthetic structural coding sequences produced by the method of this invention where the synthetic coding sequence expresses its protein product in monocotyledonous plants at levels significantly higher than corresponding wild-type coding sequences.

The present invention is also directed to a novel method comprising reducing the frequency of rare and semi-rare monocotyledonous codons in the nucleotide sequence by substituting the rare and semi-rare codons with a more-preferred monocotyledonous codon, reducing the occurrence of polyadenylation signals and intron splice sites in the nucleotide sequence, removing self-complementary sequences in the nucleotide sequence and replacing such sequences with non-self-complementary nucleotides while maintaining a structural gene encoding the polypeptide, and reducing the frequency of occurrence of 5′-CG-3′ dinucleotide pairs in the nucleotide sequence, wherein these steps are performed sequentially and have a cumulative effect resulting in a nucleotide sequence containing a preferential utilization of the more-preferred monocotyledonous codons for monocotyledonous plants for a majority of the amino acids present in the polypeptide.

The present invention is also directed to a method which further includes analyzing the coding sequence in successive six nucleotide fragments (six-mers) and altering the sequence based on the frequency of appearance of the six-mers as compared to the frequency of appearance of the rarest 284, 484 and 664 six-mers in monocotyledonous plants. More particularly, the coding sequence to be introduced into a plant is analyzed and altered in a manner that (a) reduces the frequency of appearance of any of the rarest 284 monocotyledonous six-mers to produce a coding sequence with less than about 0.5% of the rarest 284 six-mers, (b) reduces the frequency of appearance of any of the rarest 484 monocotyledonous six-mers to produce a coding sequence with less than about 1.5% of the rarest 484 six-mers, and (c) reduces the frequency of appearance of any of the rarest 664 monocotyledonous six-mers to produce a coding sequence with less than about 3% of the rarest 664 six-mers.

The present invention is further directed to monocotyledonous plants and seeds containing synthetic DNA sequences prepared by the methods of this invention.

Therefore, it is an object of the present invention to provide synthetic DNA sequences that are capable of expressing their respective proteins at relatively higher levels than the corresponding wild-type DNA sequence and methods for the preparation of such sequences. It is a particular object of this invention to provide synthetic DNA sequences that express a crystal protein toxin gene of B.t. at such relatively high levels.

It is also an object of the present invention to provide a method for improving protein accumulation from a foreign gene transformed into a monocotyledonous plant (particularly maize) and/or improving the frequency of obtaining transformed monocotyledonous plants (particularly maize) which accumulate the insecticidal B.t. crystal protein at levels greater than 0.005 wt. % of total soluble protein, by altering the nucleotide sequence in the coding region of the foreign gene by reducing the frequency of codons that are infrequently utilized in monocotyledonous plant genes and substituting frequently utilized monocotyledonous plant codons therefor.

BRIEF DESCRIPTION OF THE DRAWING FIGURES

FIG. 1 is a table listing the frequency of abundance of each of the codons for each amino acid for typical monocotyledonous plant genes.

FIGS. 2A-E are lists of the most rare 284 [FIG. 2a], 484 [FIG. 2b, FIG. 2c] and 664 [FIG. 2d, FIG. 2e] six-mers in typical monocotyledonous plant genes.

FIGS. 3A-C are the DNA sequence of B.t. var. kurstaki (B.t.k.) CryIA(b) modified in accordance with the teachings of the present invention (SEQ ID NO:1).

FIGS. 4A and 4B are the DNA sequence of the CryIIB insecticidal protein modified in accordance with the teachings of the present invention (SEQ ID NO:2).

FIGS. 5A-C are the DNA sequence of a synthetic DNA sequence encoding B.t. var. kurstaki CryIA(b)/CryIA(c) modified in accordance with one method of the prior art (SEQ ID NO:3).

FIG. 6 illustrates the construction of the intact CryIA(b) synthetic gene from subclones and the strategy involved.

FIG. 7 is a plasmid map of pMON19433.

FIG. 8 is a plasmid map of pMON10914.

FIGS. 9A-C are the DNA sequence of a B.t. var. kurstaki insecticidal protein wherein the front half of the coding sequence is not modified and the back half is modified in accordance with the method of the present invention (SEQ ID NO: 105).

FIG. 10 is a graphical representation of the range of expression of a B.t. DNA sequence modified in accordance with the method of the present invention in R0 corn plants as compared to a B.t. DNA sequence prepared by a method of the prior art.

FIG. 11 illustrates the method of construction of the CryIIB DNA sequence modified in accordance with a second embodiment of the present invention.

FIG. 12 is a plasmid map of pMON19470.

FIGS. 13A-F are a comparison of the wild-type bacterial B.t.k. CryIA(b) DNA coding sequence (SEQ ID NO: 164) with the modified B.t.k. CryIA(b) DNA sequence as shown in FIG. 3 and identified as SEQ ID NO:1.

FIGS. 14A and 14B are the DNA sequence of the CryIIA synthetic DNA sequence which was used as the starting DNA sequence for the preparation of the CryIIB synthetic DNA according to one method of the present invention (SEQ ID NO:106).

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

The following definitions are provided for clarity of the terms used in the description of this invention.

“Rare monocotyledonous codons” refers to codons which have an average frequency of abundance in monocotyledonous plant genes of less than 10%. That is, for purposes of the present invention the rare monocotyledonous codons include GTA, AGA, CGG, CGA, AGT, TCA, ATA, TTA and CTA.

“Semi-rare monocotyledonous codons” refers to codons which have an average frequency of abundance in monocotyledonous plant genes of between 10%-20%. That is, for purposes of the present invention the semi-rare monocotyledonous codons include GGG, GGA, GAA, GCA, CGT, TCG, TCT, AAA, ACA, ACT, TGT, TAT, TTG, CTT and CCT.

An “average monocotyledonous codon” refers to codons which have an average frequency of greater than about 20%, but are not a “more-preferred monocotyledonous codon.” That is, for purposes of the present invention the average monocotyledonous codons include GGT, GAT, GCT, AAT, ATT, ACG, TTT, CAT, CCG, CCA, GCG and CCC.

“More-preferred monocotyledonous codons” refers to the one or two most abundantly utilized monocotyledonous codons for each individual amino acid appearing in monocotyledonous plant genes as set forth in Table 1 below.

TABLE 1

Amino Acid    Preferred Codon(s)
Gly           GGC
Glu           GAG
Asp           GAC
Val           GTG, GTC
Ala           GCC
Arg           AGG, CGC
Ser           AGC, TCC
Lys           AAG
Asn           AAC
Met           ATG
Ile           ATC
Thr           ACC
Trp           TGG
Cys           TGC
Tyr           TAC
Leu           CTG, CTC
Phe           TTC
Pro           CCC
Gln           CAG
His           CAC
End           TAG, TGA

The determination of which codons are the more preferred monocotyledonous codons is done by compiling a list of mostly single copy monocotyledonous genes, where redundant members of multigene families have been removed. Codon analysis of the resulting sequences identifies the codons used most frequently in these genes. The monocot codon frequencies for each amino acid as determined by such an analysis are shown in FIG. 1 and are consistent with reported codon frequency determinations such as in Table 4 of E. E. Murray, et al. “Codon Usage in Plant Genes” NAR 17:477-498 (1989).
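The codon analysis described above can be illustrated with a minimal sketch. It assumes that "frequency" means the percentage of a codon among all codons for the same amino acid, pooled over the compiled gene list, and that the sequences contain only A, C, G and T in frame. The function names `codon_frequencies` and `classify` are hypothetical and are not part of the original disclosure.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, built in TCAG order ('*' marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def codon_frequencies(genes):
    """Return {amino_acid: {codon: percent}} pooled over all coding sequences."""
    counts = Counter()
    for gene in genes:
        gene = gene.upper()
        # Walk the sequence codon by codon, ignoring a trailing partial codon.
        for i in range(0, len(gene) - len(gene) % 3, 3):
            counts[gene[i:i + 3]] += 1
    totals = defaultdict(int)
    for codon, n in counts.items():
        totals[CODON_TABLE[codon]] += n
    freqs = defaultdict(dict)
    for codon, n in counts.items():
        aa = CODON_TABLE[codon]
        freqs[aa][codon] = 100.0 * n / totals[aa]
    return dict(freqs)


def classify(percent):
    """Bin a codon using the 10% and 20% cut-offs from the definitions."""
    if percent < 10:
        return "rare"
    if percent <= 20:
        return "semi-rare"
    return "average or more-preferred"
```

Applied to a real gene list, the per-amino-acid percentages would play the role of the FIG. 1 table; the toy input used to exercise the sketch is far too small to be meaningful.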

It has been discovered that a nucleotide sequence capable of enhanced expression in monocots can be obtained by reducing the frequency of usage of the rare and semi-rare monocotyledonous codons and preferentially utilizing the more-preferred monocotyledonous codons found in monocot plant genes. Therefore, the present invention provides a method for modifying a DNA sequence encoding a polypeptide to enhance accumulation of the polypeptide when expressed in a monocotyledonous plant. In another aspect, the present invention provides novel synthetic DNA sequences, encoding a polypeptide or protein that is not native to a monocotyledonous plant, that are expressed at greater levels in the plant than the native DNA sequence if expressed in the plant.

The invention will primarily be described with respect to the preparation of synthetic DNA sequences (also referred to as “nucleotide sequences, structural coding sequences or genes”) which encode the crystal protein toxin of Bacillus thuringiensis (B.t.), but it should be understood that the method of the present invention is applicable to any DNA coding sequence which encodes a protein which is not natively expressed in a monocotyledonous plant that one desires to have expressed in the monocotyledonous plant.

DNA sequences modified by the method of the present invention are effectively expressed at a greater level in monocotyledonous plants than the corresponding non-modified DNA sequence. In accordance with the present invention, DNA sequences are modified to reduce the abundance of rare and semi-rare monocotyledonous codons in the sequence by substituting them with a more-preferred monocotyledonous codon. If the codon in the native sequence is neither rare, semi-rare nor a more-preferred codon, it is generally not changed. This results in a modified DNA sequence that has a significantly lower abundance of rare and semi-rare monocotyledonous codons and a greater abundance of the more-preferred monocotyledonous codons. In addition, the DNA sequence is further modified to reduce the frequency of CG dinucleotide pairs in the modified sequence. Preferably, the frequency of CG dinucleotides is reduced to a frequency of less than about 8% in the final modified sequence. The DNA sequence is also modified to reduce the occurrence of putative polyadenylation sites, intron splice sites and potential mRNA instability sites. As a result of the modifications, the modified DNA sequence will typically contain an abundance of between about 65%-90% of the more-preferred codons.

In order to construct a modified DNA sequence in accordance with the method of the present invention, the amino acid sequence of the desired protein must be determined and back-translated into all the available codon choices for each amino acid. It should be understood that an existing DNA sequence can be used as the starting material and modified by standard mutagenesis methods which are known to those skilled in the art or a synthetic DNA sequence having the desired codons can be produced by known oligonucleotide synthesis methods. For the purpose of brevity and clarity, the invention will be described in terms of a mutagenesis protocol. The amino acid sequence of the protein can be analyzed using commercially available computer software such as the “BackTranslate” program of the GCG Sequence Analysis Software Package.
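As a rough illustration of back-translation into all available codon choices, the following sketch lists every synonymous codon for each residue of a one-letter amino acid sequence. It is a generic example based on the standard genetic code, not a reproduction of the GCG "BackTranslate" program; the function name `back_translate` is hypothetical.

```python
from itertools import product

# Standard genetic code, built in TCAG order ('*' marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def back_translate(protein):
    """Return, for each residue, the sorted list of codons that encode it."""
    choices = {}
    for codon, aa in CODON_TABLE.items():
        choices.setdefault(aa, []).append(codon)
    return [sorted(choices[aa]) for aa in protein.upper()]
```

For example, `back_translate("MK")` yields one choice for methionine and two for lysine; the codon selection rules described in the text then narrow each list.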

Because most coding sequences of proteins of interest are of a substantial length, generally between 200-3500 nucleotides in length, the DNA sequence that encodes the protein is generally too large to facilitate mutagenesis or complete synthesis in one step. Therefore, it will typically be necessary to break the DNA sequence into smaller fragments of between about 300 bp and 1500 bp in length. To do this and to facilitate subsequent reassembly operations, desired restriction sites in the sequence are identified. The restriction site sequences will, therefore, determine the codon usage at those sites.

The native DNA sequence is then compared to the frequency of codon usage for monocotyledonous plants as shown in FIG. 1. Those codons present in the native DNA sequence that are identified as being “rare monocotyledonous codons” are changed to the “more preferred monocotyledonous codon” such that the percentage of rare monocotyledonous codons in the modified DNA sequence is greater than about 0.1% and less than about 0.5% of the total codons in the resulting modified DNA sequence. Semi-rare monocotyledonous codons identified in the native DNA sequence are changed to the more-preferred monocotyledonous codon such that the percentage of semi-rare monocotyledonous codons is greater than about 2.5% and less than about 10% of the total codons in the resulting modified DNA sequence and, preferably, less than about 5% of the total codons in the resulting modified DNA sequence. Codons identified in the native DNA sequence that are “average monocotyledonous codons” are not changed.
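The substitution step can be sketched as follows, using the rare and semi-rare codon lists from the definitions and the preferred codons of Table 1. This is a simplified illustration under stated assumptions: the input is an in-frame, upper-case coding sequence; every rare or semi-rare codon is replaced; and where Table 1 offers two preferred codons, the first is chosen arbitrarily. The described method leaves a small residual percentage of such codons (for example at restriction sites), which this sketch does not model. The function names are hypothetical.

```python
from itertools import product

# Standard genetic code, built in TCAG order ('*' marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

RARE = {"GTA", "AGA", "CGG", "CGA", "AGT", "TCA", "ATA", "TTA", "CTA"}
SEMI_RARE = {"GGG", "GGA", "GAA", "GCA", "CGT", "TCG", "TCT", "AAA",
             "ACA", "ACT", "TGT", "TAT", "TTG", "CTT", "CCT"}
# Preferred codons per amino acid (one-letter code), taken from Table 1.
PREFERRED = {
    "G": ["GGC"], "E": ["GAG"], "D": ["GAC"], "V": ["GTG", "GTC"],
    "A": ["GCC"], "R": ["AGG", "CGC"], "S": ["AGC", "TCC"], "K": ["AAG"],
    "N": ["AAC"], "M": ["ATG"], "I": ["ATC"], "T": ["ACC"], "W": ["TGG"],
    "C": ["TGC"], "Y": ["TAC"], "L": ["CTG", "CTC"], "F": ["TTC"],
    "P": ["CCC"], "Q": ["CAG"], "H": ["CAC"], "*": ["TAG", "TGA"],
}


def translate(seq):
    """Translate an in-frame coding sequence to one-letter amino acids."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


def replace_disfavored(seq):
    """Swap each rare or semi-rare codon for a preferred synonymous codon."""
    out = []
    for i in range(0, len(seq), 3):
        codon = seq[i:i + 3]
        if codon in RARE or codon in SEMI_RARE:
            # Assumption: always take the first preferred codon listed.
            codon = PREFERRED[CODON_TABLE[codon]][0]
        out.append(codon)
    return "".join(out)
```

In this sketch an average codon such as GGT passes through unchanged, and the encoded amino acid sequence is the same before and after substitution.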

After the rare and semi-rare monocotyledonous codons have been changed to the more preferred monocotyledonous codon as described above, the DNA sequence is further analyzed to determine the frequency of occurrence of the dinucleotide 5′-CG-3′. This CG dinucleotide is a known DNA methylation site and it has been observed that methylated DNA sequences are often poorly expressed or not expressed at all. Therefore, if the codon changes as described above have introduced a significant number of CG dinucleotide pairs into the modified DNA sequence, the frequency of appearance of 5′-CG-3′ dinucleotide pairs is reduced such that the modified DNA sequence has less than about 8% CG dinucleotide pairs, and preferably less than about 7.5% CG dinucleotide pairs. It is understood that any changes to the DNA sequence always preserve the amino acid sequence of the native protein.
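A simple way to monitor the CG dinucleotide frequency is sketched below. The text does not state how the percentage is computed; this example assumes it is the number of 5′-CG-3′ occurrences divided by the number of overlapping dinucleotide positions (sequence length minus one). The function name `cg_percent` is hypothetical.

```python
def cg_percent(seq):
    """Percentage of overlapping dinucleotide positions that read 5'-CG-3'."""
    seq = seq.upper()
    positions = len(seq) - 1  # number of overlapping dinucleotides
    if positions < 1:
        return 0.0
    hits = sum(1 for i in range(positions) if seq[i:i + 2] == "CG")
    return 100.0 * hits / positions
```

Under this assumed definition, a modified sequence would be checked against the thresholds of about 8% and, preferably, about 7.5%.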

The G+C composition of the modified DNA sequence is also important to the overall effect of the expression of the modified DNA sequence in a monocotyledonous plant. Preferably, the modified DNA sequence prepared by the method of this invention has a G+C composition greater than about 50%, and more preferably greater than about 55%.
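The G+C composition can be checked with a short helper such as the following sketch, which assumes the conventional definition (G and C bases as a percentage of all bases). The function name `gc_percent` is hypothetical.

```python
def gc_percent(seq):
    """Percentage of bases in the sequence that are G or C."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return 100.0 * sum(1 for base in seq if base in "GC") / len(seq)
```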

The modified DNA sequence is then analyzed for the presence of any destabilizing ATTTA sequences, putative polyadenylation signals or intron splice sites. If any such sequences are present, they are preferably removed. For purposes of the present invention, putative polyadenylation signals include, but are not necessarily limited to, AATAAA, AATAAT, AACCAA, ATATAA, AATCAA, ATACTA, ATAAA, ATGAAA, AAGCAT, ATTAAT, ATACAT, AAAATA, ATTAAA, AATTAA, AATACA and CATAAA. For purposes of the present invention, intron splice sites include, but are not necessarily limited to, WGGTAA (5′ intron splice site) and TRYAG (3′ intron splice site), where W=A or T, R=A or G, and Y=C or T. When any of the ATTTA sequences, putative polyadenylation signals or intron splice sites are changed, they are preferably replaced with one of the more preferred monocotyledonous codons or one of the average monocotyledonous codons. In essence, after the desired codon changes have been made to the native DNA sequence to produce the modified DNA sequence, the modified DNA sequence is analyzed according to the method described in commonly assigned U.S. patent application Ser. No. 07/476,661 filed Feb. 12, 1990, U.S. patent application Ser. No. 07/315,355 filed Feb. 24, 1989, and EPO 385 962 published Sep. 5, 1990, each of which is hereby incorporated by reference. It is to be understood that while all of the putative polyadenylation signals and intron splice sites are preferably removed from the modified DNA sequence, a modified DNA sequence according to the present invention may include one or more of such sequences and still be capable of providing enhanced expression in monocotyledonous plants.
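The scan for undesirable motifs can be sketched with regular expressions. The example below assumes the destabilizing motif is ATTTA (as in the Fischhoff et al. approach cited in the Background), uses the polyadenylation signals exactly as listed in the text, and expands the ambiguity codes W, R and Y for the splice-site patterns. A lookahead is used so that overlapping occurrences are all reported. The function names and the category labels are hypothetical.

```python
import re

# Putative polyadenylation signals, copied as listed in the text.
POLYA_SIGNALS = [
    "AATAAA", "AATAAT", "AACCAA", "ATATAA", "AATCAA", "ATACTA", "ATAAA",
    "ATGAAA", "AAGCAT", "ATTAAT", "ATACAT", "AAAATA", "ATTAAA", "AATTAA",
    "AATACA", "CATAAA",
]
# Ambiguity codes used by the splice-site patterns.
IUPAC = {"W": "[AT]", "R": "[AG]", "Y": "[CT]"}
GROUPS = [
    ("instability", ["ATTTA"]),          # assumed destabilizing motif
    ("polyA", POLYA_SIGNALS),
    ("splice", ["WGGTAA", "TRYAG"]),     # 5' and 3' intron splice sites
]


def to_regex(motif):
    """Expand ambiguity codes into regular-expression character classes."""
    return "".join(IUPAC.get(ch, ch) for ch in motif)


def find_motifs(seq):
    """Return sorted (position, kind, matched_text) for every motif hit."""
    seq = seq.upper()
    hits = []
    for kind, motifs in GROUPS:
        for motif in motifs:
            # Zero-width lookahead so overlapping matches are not skipped.
            pattern = re.compile("(?=(%s))" % to_regex(motif))
            for match in pattern.finditer(seq):
                hits.append((match.start(), kind, match.group(1)))
    return sorted(hits)
```

Each reported position marks a site where a synonymous codon change could be considered; choosing the replacement codon is left to the rules described in the text.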

The resulting DNA sequence prepared according to the above description, whether by modifying an existing native DNA sequence by mutagenesis or by the de novo chemical synthesis of a structural gene, is the preferred modified DNA sequence to be introduced into a monocotyledonous plant for enhanced expression and accumulation of the protein product in the plant.

In a further embodiment of the present invention, an additional analysis is performed on the modified DNA sequence to further enhance its likelihood to provide enhanced expression and accumulation of the protein product in monocotyledonous plants. A list of rare monocotyledonous 6mer nucleotide sequences is compiled from the same list of mostly single copy monocot genes as previously described for the compilation of the frequency of usage of monocotyledonous codons. A 6mer is six consecutive nucleotides in a sequence, and the 6mers proceed in a successive fashion along the entire DNA sequence. That is, each adjacent 6mer overlaps the previous 6mer's terminal 5 nucleotides. Thus, the total number of six-mers in a DNA sequence is five less than the number of nucleotides in the DNA sequence. The frequency of occurrence of strings of six-mers was calculated from the list of monocotyledonous genes and the most rare 284, 484, and 664 monocotyledonous six-mers identified. The list of these most rare monocotyledonous six-mers is provided in FIG. 2. The modified DNA sequence is then compared to the lists of the most rare 284, 484, and 664 monocotyledonous six-mers and if one of the rare six-mers appears in the modified DNA sequence, it is removed by changing at least one of the nucleotides in the 6mer, but the amino acid sequence remains intact.
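The overlapping six-mer windows described above can be enumerated with a short sketch. The rare six-mer lists of FIG. 2 are not reproduced here, so the `rare_set` argument is a placeholder to be filled from those lists; the function names are hypothetical.

```python
def six_mers(seq):
    """Return every overlapping six-nucleotide window, in order."""
    seq = seq.upper()
    # A sequence of n nucleotides has n - 5 windows (none if n < 6).
    return [seq[i:i + 6] for i in range(len(seq) - 5)]


def rare_hits(seq, rare_set):
    """Return (position, six_mer) for each window found in rare_set."""
    return [(i, w) for i, w in enumerate(six_mers(seq)) if w in rare_set]
```

A nine-nucleotide sequence yields four windows, consistent with the "five less than the number of nucleotides" count given in the text.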

Preferably, any such 6mer found in the modified DNA sequence is altered to produce a more preferred codon in the location of the 6mer. Preferably, the total number of the rarest 284 monocotyledonous six-mers in the modified DNA sequence will be less than about 1% of the total six-mers possible in the sequence, and more preferably less than about 0.5%, the total number of the rarest 484 monocotyledonous six-mers in the modified DNA sequence will be less than about 2% of the total six-mers possible in the sequence, and more preferably less than about 1.0%, and the total number of the rarest 664 monocotyledonous six-mers in the modified DNA sequence will be less than about 5% of the total six-mers possible in the sequence, and more preferably less than about 2.5%. It has been found that the removal of these 6mer sequences in this manner is beneficial for increased expression of the DNA sequence in monocotyledonous plants.
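The percentage limits in the preceding paragraph can be checked with a sketch like the one below. It assumes that each percentage is the number of windows matching a rare list divided by the total number of six-mer windows in the sequence, and that the three rare lists (from FIG. 2, not reproduced here) are supplied by the caller as sets keyed by list size. All names are hypothetical.

```python
# Limits (percent of all six-mer windows) taken from the paragraph above.
PREFERRED_LIMITS = {284: 1.0, 484: 2.0, 664: 5.0}
MORE_PREFERRED_LIMITS = {284: 0.5, 484: 1.0, 664: 2.5}


def rare_six_mer_report(seq, rare_lists):
    """Summarize rare six-mer content for each list size in rare_lists.

    rare_lists maps a list size (284, 484 or 664) to a set of six-mers.
    """
    seq = seq.upper()
    total = max(len(seq) - 5, 0)  # number of overlapping six-mer windows
    report = {}
    for size, rare_set in rare_lists.items():
        hits = sum(1 for i in range(total) if seq[i:i + 6] in rare_set)
        percent = 100.0 * hits / total if total else 0.0
        report[size] = {
            "percent": percent,
            "preferred": percent < PREFERRED_LIMITS[size],
            "more_preferred": percent < MORE_PREFERRED_LIMITS[size],
        }
    return report
```

A sequence with one rare hit in 200 windows sits at 0.5% for that list, which satisfies the "less than about 1%" limit but not the stricter "less than about 0.5%" limit under a strict comparison.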

The method of the present invention has applicability to any DNA sequence that is desired to be introduced into a monocotyledonous plant to provide any desired characteristic in the plant, such as herbicide tolerance, virus tolerance, insect tolerance, drought tolerance, or enhanced or improved phenotypic characteristics such as improved nutritional or processing characteristics. Of particular importance is the provision of insect tolerance to a monocotyledonous plant by the introduction of a novel gene encoding a crystal protein toxin from B.t. into the plant. Especially preferred are the insecticidal proteins of B.t. that are effective against insects of the orders Lepidoptera and Coleoptera, such as the crystal protein toxins of B.t. var. kurstaki CryIA(b) and CryIA(c) and the CryIIB protein.

The modified DNA sequences of the present invention are expressed in a plant in an amount sufficient to achieve the desired phenotype in the plant. That is, if the modified DNA sequence is introduced into the monocotyledonous plant to confer herbicide tolerance, it is designed to be expressed in herbicide tolerant amounts. It is understood that the amount of expression of a particular protein in a plant to provide a desired phenotype to the plant may vary depending upon the species of plant, the desired phenotype, environmental factors, and the like and that the particular amount of expression is determined in the particular situation by routine analysis of varying amounts or levels of expression.

A preferred modified DNA sequence for the control of insects, particularly Lepidopteran type insects, is provided as SEQ ID NO: 1 and is shown in FIG. 3. A preferred modified DNA sequence expressing an effective B.t. CryIIB protein is provided as SEQ ID NO: 2 and is shown in FIG. 4.

As will be described in more detail in the Examples to follow, the preferred modified DNA sequences were constructed by mutagenesis which required the use of numerous oligonucleotides. Generally, the method of Kunkel, T. A., Proc. Natl. Acad. Sci. USA (1985), Vol. 82, pp. 488-492, for oligonucleotide mutagenesis of single stranded DNA was used to introduce the desired sequence changes into the starting DNA sequence. The oligonucleotides were designed to introduce the desired codon changes into the starting DNA sequence. The preferred size for the oligonucleotides is around 40-50 bases, but fragments ranging from 17 to 81 bases have been utilized. In most situations, a minimum of 5 to 8 base pairs of homology to the template DNA on both ends of the synthesized fragment is maintained to ensure proper hybridization of the primer to the template. Multiple rounds of mutagenesis were sometimes required to introduce all of the desired changes and to correct any unintended sequence changes, as commonly occur in mutagenesis. It is to be understood that extensive sequencing analysis using standard and routine methodology on both the intermediate and final DNA sequences is necessary to assure that the precise DNA sequence as desired is obtained.

The expression of a foreign DNA sequence in a monocotyledonous plant requires proper transcriptional initiation regulatory regions, i.e. a promoter sequence, an intron, and a polyadenylation site region recognized in monocotyledonous plants, all linked in a manner which permits the transcription of the coding sequence and subsequent processing in the nucleus. A DNA sequence containing all of the necessary elements to permit the transcription and ultimate expression of the coding sequence in the monocotyledonous plant is referred to as a “DNA construct.” The details of construction of such a DNA construct are well known to those in the art and the preparation of vectors carrying such constructs is also well-known.

Numerous promoters are known or are found to cause transcription of RNA in plant cells and can be used in the DNA construct of the present invention. Examples of suitable promoters include the nopaline synthase (NOS) and octopine synthase (OCS) promoters, the light-inducible promoter from the small subunit of ribulose bis-phosphate carboxylase, the CaMV 35S and 19S promoters, the full-length transcript promoter from Figwort mosaic virus, ubiquitin promoters, actin promoters, histone promoters, tubulin promoters, or the mannopine synthase promoter (MAS). The promoter may also be one that causes preferential expression in a particular tissue, such as leaves, stems, roots, or meristematic tissue, or the promoter may be inducible, such as by light, heat stress, water stress or chemical application or production by the plant. Exemplary green tissue-specific promoters include the maize phosphoenol pyruvate carboxylase (PEPC) promoter, small subunit ribulose bis-carboxylase promoters (ssRUBISCO) and the chlorophyll a/b binding protein promoters. The promoter may also be a pith-specific promoter, such as the promoter isolated from a plant TrpA gene as described in International Publication No. WO93/07278, published Apr. 15, 1993. Other plant promoters may be obtained preferably from plants or plant viruses and can be utilized so long as they are capable of causing sufficient expression in a monocotyledonous plant to result in the production of an effective amount of the desired protein. Any promoter used in the present invention may be modified, if desired, to alter its control characteristics. For example, the CaMV 35S or 19S promoters may be enhanced by the method described in Kay et al. Science (1987) Vol. 236, pp. 1299-1302.

The DNA construct prepared for introduction into the monocotyledonous plant also preferably contains an intron sequence which is functional in monocotyledonous plants, preferably immediately 3′ of the promoter region and immediately 5′ to the structural coding sequence. It has been observed that the inclusion of such a DNA sequence in the monocotyledonous DNA construct enhances the expression of the protein product. Preferably, the intron derived from the first intron of the maize alcohol dehydrogenase gene (MzADH1) as described in Callis et al. Genes and Devel. (1987) Vol. 1, pp. 1183-1200, or the maize hsp70 intron (as described in PCT Publication No. WO93/19189) is used. The HSP70 intron can be synthesized using the polymerase chain reaction from a genomic clone containing a maize HSP70 gene (pMON9502); see Rochester et al. (1986) EMBO J., 5:451-458.

The RNA produced by a DNA construct of the present invention also contains a 3′ non-translated polyadenylation site region recognized in monocotyledonous plants. Various suitable 3′ non-translated regions are known and can be obtained from viral RNA, suitable eukaryotic genes or from a synthetic gene sequence. Examples of suitable 3′ regions are (a) the 3′ transcribed, non-translated regions containing the polyadenylation signal of Agrobacterium tumefaciens (Ti) plasmid genes, such as the NOS gene, and (b) plant genes such as the soybean storage protein (7S) genes and the small subunit of the RuBP carboxylase (E9) gene.

The modified DNA sequence of the present invention may be linked to an appropriate amino-terminal chloroplast transit peptide or secretory signal sequence to transport the transcribed sequence to a desired location in the plant cell.

A DNA construct containing a structural coding sequence prepared in accordance with the method of the present invention can be inserted into the genome of a plant by any suitable method. Examples of suitable methods include Agrobacterium tumefaciens mediated transformation, direct gene transfer into protoplasts, microprojectile bombardment, injection into protoplasts, cultured cells and tissues or meristematic tissues, and electroporation. Preferably, the DNA construct is transferred to the monocot plant tissue through the use of the microprojectile bombardment process, which is also referred to as “particle gun technology.” In this method of transfer, the DNA construct is initially coated onto a suitable microprojectile and the microprojectile containing DNA is accelerated into the target tissue by a microprojectile gun device. The design of the accelerating device is not critical so long as it can produce a sufficient acceleration function. The accelerated microprojectiles impact upon the prepared target tissue to perform the gene transfer. The DNA construct used in a microprojectile bombardment method of DNA transfer is preferably prepared as a plasmid vector coated onto gold or tungsten microprojectiles. In this regard, the DNA construct will be associated with a selectable marker gene which allows transformed cells to grow in the presence of a metabolic inhibitor that slows the growth of non-transformed cells. This growth advantage of the transgenic cells permits them to be distinguished, over time, from the slower growing or non-growing cells. Preferred selectable marker genes for monocotyledonous plants include a mutant acetolactate synthase DNA sequence which confers tolerance to sulfonylurea herbicides such as chlorsulfuron, the NPTII gene which confers resistance to aminoglycosidic antibiotics such as kanamycin or G418, or a bar gene (DeBlock et al., 1987, EMBO J. 6:2513-2518; Thompson et al., 1987, EMBO J. 6:2519-2523) for resistance to phosphinothricin or bialaphos. Alternatively, or in conjunction with a selectable marker, a visual screenable marker such as the E. coli β-glucuronidase gene or a luciferase gene can be included in the DNA construct to facilitate identification and recovery of transformed cells.

Suitable plants for use in the practice of the present invention include the group of plants referred to as the monocotyledonous plants and include, but are not necessarily limited to, maize, rice and wheat.

The following examples are illustrative in nature and are provided to better elucidate the practice of the present invention and are not to be interpreted in a limiting sense. Those skilled in the art will recognize that various modifications, truncations, additions or deletions, etc. can be made to the methods and DNA sequences described herein without departing from the spirit and scope of the present invention.

EXAMPLE 1

This example is provided to illustrate the construction of a novel DNA sequence encoding the crystal toxin protein from B. thuringiensis var. kurstaki CryIA(b) according to the method of the present invention that exhibits enhanced accumulation of its protein product when expressed in maize.

As the starting DNA sequence to be modified in accordance with the method of the present invention, the synthetic CryIA(b)/CryIA(c) DNA sequence as described in European Patent Application Publication No. 0385962 was utilized. This DNA sequence encodes a fusion B.t. kurstaki protein with the insect specificity conferred by the amino-terminal CryIA(b) portion. This DNA has been modified to remove any ATTTA sites and any putative polyadenylation sites and intron splice sites. This sequence is identified as SEQ ID NO: 3 and is shown in FIG. 5.

The amino acid sequence of this B.t. sequence was known and all of the available codon choices were determined by analyzing the amino acid sequence using the “BackTranslate” program of the GCG Sequence Analysis Software Package. Because the B.t. gene is rather large (3569 bp in length), the mutagenesis process was conducted on a plurality of individual, smaller fragments of the starting DNA sequence as will be described below. The codon usage of the starting DNA sequence was then compared to the monocotyledonous codon frequency table as shown in FIG. 1 to determine which codons in the starting DNA sequence are rare or semi-rare monocotyledonous codons and are to be replaced with a more-preferred monocotyledonous codon. The modified DNA sequence design was then determined, while keeping in mind the restriction sites necessary to facilitate religation of the DNA sequence after mutagenesis was complete. The modified DNA sequence design was then analyzed for any nucleotide strings of ATTTA or putative polyadenylation sites or intron splice sites. The modified DNA sequence design was then further modified to remove substantially all of such nucleotide strings, although one string of TTTTT, TRYAG, ATTTA, and AAGCAT remained in the design. The modified DNA sequence design was then analyzed for the occurrence of the dinucleotide 5′-CG-3′ and, when possible, the modified DNA sequence was designed to remove such dinucleotide pairs, although not all such dinucleotide pairs were removed. The resulting design is the preferred monocotyledonous CryIA(b) DNA sequence design, and this sequence was compared to the starting DNA sequence by a sequence alignment program (Bestfit program of the GCG Sequence Analysis Software Package) to determine the number of mutagenesis primers needed to convert the starting DNA sequence into the modified DNA sequence.
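The two design steps described above, in-frame replacement of disfavored codons followed by a scan for undesirable nucleotide strings, can be sketched in Python. This is a minimal illustration under stated assumptions and not the procedure actually used: the replacement map is a hypothetical stand-in for the codon frequency table of FIG. 1, the function names are invented for the example, and R and Y in TRYAG are read as the standard IUPAC codes for purine (A/G) and pyrimidine (C/T).

```python
import re

# Hypothetical map of disfavored codons to synonymous replacements.
# The real choices come from the monocot codon frequency table (FIG. 1).
PREFERRED = {"TTA": "CTG", "CTA": "CTG", "ATA": "ATC", "GTA": "GTG"}

# Nucleotide strings named in the text; R = A/G and Y = C/T (IUPAC codes).
MOTIFS = ["TTTTT", "TRYAG", "ATTTA", "AAGCAT"]
IUPAC = {"R": "[AG]", "Y": "[CT]"}


def replace_rare_codons(seq, preferred=PREFERRED):
    """Swap each listed codon for its synonymous replacement, in frame."""
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    return "".join(preferred.get(codon, codon) for codon in codons)


def find_motifs(seq, motifs=MOTIFS):
    """Return (motif, 0-based position) pairs, including overlapping hits."""
    hits = []
    for motif in motifs:
        pattern = "".join(IUPAC.get(base, base) for base in motif)
        # A zero-width lookahead lets finditer report overlapping matches.
        for match in re.finditer("(?=" + pattern + ")", seq):
            hits.append((motif, match.start()))
    return sorted(hits, key=lambda hit: hit[1])


# Toy sequences, not taken from the CryIA(b) gene.
print(replace_rare_codons("ATGTTAATAGTACGA"))
print(find_motifs("GGATTTACCTGTAGTT"))
```

In practice the scan would be rerun after every round of codon changes, since a replacement that removes a rare codon can create a new ATTTA or CG across a codon boundary.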

The oligonucleotide mutagenesis primers were synthesized and purified by GENOSYS and the mutagenesis was carried out with the Bio-Rad Muta-Gene Enzyme Pack as described in the manufacturer's instruction manual. Following the mutagenesis reaction, a 10-30 μl aliquot of the ligation mix was transformed into JM101 cells and selected on LBr Cb50. Individual transformed colonies were picked into 96-well microtiter plates containing 150 μl 2XYT Cb50. After overnight growth at 37° C., the cultures were replicated onto S&S Nytran filters on 2XYT-Cb50 plates and allowed to grow overnight at 37° C. The filters were treated with denaturing solution (1.5M NaCl, 0.5M NaOH) for 5 minutes, neutralizing solution (3M NaOAc, pH 5.0) for 5 minutes, air dried for 30 minutes, then baked for 1 hour at 80° C.

The desired mutants were identified by differential primer melt-off at 65° C. Mutagenesis oligonucleotides were end-labelled with either P³² or DIG-ddUTP. When P³² oligonucleotides were used, hybridizations were done overnight at 42° C. in 50% formamide, 3X SSPE, 5X Denhardt's, 0.1%-20% SDS and 100 μg/ml tRNA. Filters were washed in 0.2X SSC, 0.1% SDS for minutes at 65° C. The filters were exposed to X-ray film for 1 hour. Colonies that contained the mutagenesis oligonucleotide retained the probe and gave a dark spot on the X-ray film. Parental colonies not subjected to mutagenesis were included in each screen as negative controls. For non-radioactive probes, the Genius DIG Oligonucleotide 3′-end labelling kit was used (Boehringer-Mannheim Biochemical, Indianapolis, IN) as per the manufacturer's instructions. Hybridization conditions were 50% formamide, 5X SSPE, 2% blocking solution, 0.1% N-laurylsarcosine, 0.02% SDS, and 100 μg/ml tRNA. Temperatures for hybridization and filter washes were as previously stated for the radioactive method. Lumi-Phos 530 (Boehringer Mannheim) was used for detection of hybrids, with exposures of 1 hour to X-ray film. DNA from the positive colonies was sequenced to confirm the desired nucleotide sequences. If further changes were needed, a new round of mutagenesis using new oligonucleotides and the above described procedures was carried out.

Plasmids were transformed into the E. coli dut-, ung- strains BW313 or CJ236 for use as templates for mutagenesis. Fifteen ml of 2XYT media containing 50 μg/ml carbenicillin was inoculated with 300 μl of overnight culture containing one of the plasmids. The culture was grown to an OD of 0.3 and 15 μl of a stock of M13K07 helper phage was added. The shaking culture was harvested after 5 hours. Centrifugation at 10K for 15 minutes removed the bacteria and cell debris. The supernatant was passed through a 45 micron filter and 3.6 ml of 20% PEG/2.5M NaCl was added. The sample was mixed thoroughly and stored on ice for 30 minutes. The supernatant was centrifuged at 11K for 15 minutes. The phage pellet was resuspended in 400 μl Tris-EDTA, pH 8.0 (TE buffer) and extracted once with chloroform and twice with phenol:chloroform:isoamyl alcohol. Forty μl of 7.5M NH₄OAc was added, then 1 ml ethanol. The DNA pellet was resuspended in 100 μl TE.

The method employed in the construction of the modified CryIA(b) DNA sequence is illustrated in FIG. 6. The starting clones containing the starting CryIA(b)/CryIA(c) DNA sequence included the following. pMON10922, which is shown in FIG. 7, was derived from pMON19433 by replacement of the GUS coding region of pMON19433 with the NcoI-EcoRI restriction fragment from pMON10914. pMON10914 is shown in FIG. 8 and contains a pUC plasmid with a CaMV 35S promoter/NptII/NOS 3′ cassette and an ECaMV 35S promoter (enhanced CaMV 35S promoter according to the method of Kay et al.)/Adh1 intron/(DNA sequence B. t. k. CryIA(b)/CryIA(c))/NOS 3′ cassette; the only sequences used from pMON10914 are those between the BglII site (nucleotide #1) at the 5′ end of the CryIA(b)/CryIA(c) DNA sequence and the EcoRI site (nucleotide #3569) at the 3′ end of the sequence. pMON19470 consists of the ECaMV 35S promoter, the hsp70 intron and NOS 3′ polyA region in a pUC vector containing an NPTII selectable marker. pMON19689 was derived from pMON10922: the 3′ region of the CryIA(b)/CryIA(c) B.t. gene in pMON10922 was excised using XhoI (nucleotide #1839) and EcoRI (nucleotide #3569) and replaced with an oligonucleotide pair having the sequence

5′-TCGAGTGATTCGAATGAG-3′ SEQ ID NO:4, and 5′-AATTCTCATTCGAATCAC-3′ SEQ ID NO:5,

which, when annealed, creates XhoI and EcoRI cohesive ends that were ligated into pMON10922 to form pMON19689, which therefore contains a truncated CryIA(b) DNA sequence.
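The cohesive ends produced by annealing SEQ ID NO:4 and SEQ ID NO:5 can be checked with a short Python sketch. The function names are illustrative, and the helper assumes that both overhangs are 5′ overhangs, which is what XhoI (C^TCGAG) and EcoRI (G^AATTC) digestion leaves on the vector.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def anneal(top, bottom):
    """Return (top 5' overhang, duplex read on the top strand, bottom 5' overhang).

    Assumes each oligonucleotide carries a 5' overhang and that the rest of
    the two strands pairs perfectly.
    """
    bottom_rc = reverse_complement(bottom)
    for start in range(len(top)):
        paired = top[start:]
        if bottom_rc.startswith(paired):
            return top[:start], paired, bottom[:len(bottom) - len(paired)]
    raise ValueError("oligonucleotides do not anneal end to end")


top_overhang, duplex, bottom_overhang = anneal(
    "TCGAGTGATTCGAATGAG",  # SEQ ID NO:4
    "AATTCTCATTCGAATCAC",  # SEQ ID NO:5
)
print(top_overhang, duplex, bottom_overhang)
```

The overhangs come out as TCGA and AATT, the ends left by XhoI and EcoRI respectively. The 14 bp duplex also contains TTCGAA, the BstBI recognition sequence, which is consistent with the BstBI sites of pMON19689 used in later cloning steps.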

The five fragments of the starting CryIA(b)/CryIA(c) sequence from pMON10914 used for mutagenesis consisted of the following: pMON15740, which contained the 674 bp fragment from pMON10914 from the BglII to XbaI (nucleotide #675) restriction site cloned into the BamHI and XbaI sites of Bluescript SK+; pMON15741, which contains the sequence from the XbaI site to the SacI site (nucleotide #1354) cloned as a 679 bp XbaI-SacI fragment into the corresponding sites of Bluescript SK+; pMON15742, which contains nucleotides #1354-#1839 cloned as a 485 bp SacI/XhoI fragment into the corresponding sites of Bluescript SK+; pMON10928, which was derived from pMON10922 by excising the PvuII (nucleotide #2969) to EcoRI fragment and inserting it into the EcoRV to EcoRI site of Bluescript SK+; and pMON10927, which was derived from pMON10922 by excising the XhoI to PvuII fragment and inserting it into the XhoI to EcoRV site of pBS SK+.
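The fragment sizes quoted above follow from the restriction site coordinates given in the text. The sketch below recomputes them in Python; the lengths for the XhoI-PvuII and PvuII-EcoRI fragments are not stated in the text and are derived here by the same convention (difference of site coordinates), which is an assumption.

```python
# Site coordinates in the starting CryIA(b)/CryIA(c) sequence, as given in the text.
SITES = {"BglII": 1, "XbaI": 675, "SacI": 1354, "XhoI": 1839, "PvuII": 2969, "EcoRI": 3569}

# Subclone name, 5' site, 3' site.
SUBCLONES = [
    ("pMON15740", "BglII", "XbaI"),
    ("pMON15741", "XbaI", "SacI"),
    ("pMON15742", "SacI", "XhoI"),
    ("pMON10927", "XhoI", "PvuII"),
    ("pMON10928", "PvuII", "EcoRI"),
]


def fragment_lengths(subclones=SUBCLONES, sites=SITES):
    """Length of each subcloned fragment as the difference of site coordinates."""
    return {name: sites[end] - sites[start] for name, start, end in subclones}


lengths = fragment_lengths()
for name, length in lengths.items():
    print(name, length, "bp")
```

The first three values reproduce the 674 bp, 679 bp and 485 bp sizes stated in the text, and the five fragments together span the interval from nucleotide #1 to #3569.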

The desired sequence changes were made to the section of the starting DNA sequence in pMON15741 by the use of oligonucleotide primers BTK15, BTK16, BTK17a and 17b (sequentially) and BTK18-BTK29 as shown in Table 2 below.

TABLE 2

OLIGO #   SEQ ID NO:   SEQUENCE
BTK15     6            TCTAGAGACT GGATTCGCTA CAACCAGTTC AGGCGCGAGC TGACCCTCAC CGTCCTGGAC ATT
BTK16     7            ATTGTGTCCC TCTTCCCGAA CTACGACTCC CGCACCTACC C
BTK17a    8            ACCTACCCGA TCCGCACCGT GTCCCAACTG ACCCGCGAAA TCT
BTK17b    9            AAATCTACAC CAACCCCGTC CTGGAGAACT TC
BTK18     10           AGCTTCAGGG GCAGCGCCCA GGGCATCGAG GGCTCCATC
BTK19     11           GCCCACACCT GATGGACATC CTCAACAGCA TCACTATCTA C
BTK20     12           TACACCGATG CCCACCGCGG CGAGTACTAC TGGTCCGGCC ACCAGATC
BTK21     13           ATGGCCTCCC CGGTCGGCTT CAGCGGCCCC GAGTT
BTK22     14           CCTCTCTACG GCACGATGGG CAACGCCGC
BTK23     15           CAACAACGCA TCGTCGCTCA GCTGGGCCAG GGTGTCTACA G
BTK24     16           GCGTCTACCG CACCCTGAGC TCCACCCTGT ACCGCAGGCC CTTCAACATC GGTATC
BTK25     17           AACCAGCAGC TGTCCGTCCT GGATGGCACT GAGTTCGC
BTK26     18           TTCGCCTACG GCACCTCCTC CAACCTGCCC TCCGCTGTCT ACCGCAAGAG CGG
BTK27     19           AAGAGCGGCA CGGTGGATTC CCTGGACGAG ATCCCACC
BTK28     20           AATGTGCCCC CCAGGCAGGG TTTTTCCCAC AGGCTCAGCC ACGT
BTK29     21           ATGTTCCGCT CCGGCTTCAG CAACTCGTCC GTGAGC

Plasmids with the desired changes were identified by colony hybridization with the mutagenesis oligonucleotides at temperatures that prevent hybridization with the original template, but allow hybridization with the plasmids that had incorporated the desired target sequence changes. In some cases unexpected sequence alterations were found. These were corrected by the use of oligonucleotides BTK44-BTK49 as shown in Table 3 below.

TABLE 3

OLIGO #   SEQ ID NO:   SEQUENCE
BTK44     22           GGGCAGCGCC CAGGGCATCG AGGGCTCCAT CAG
BTK45     23           TGCCCACCGC GGCGAGTAC
BTK46     24           CCGGTCGGCT TCAGCGGCCC CGAGTTTAC
BTK47     25           GGCCAGGGCG TCTACCGCAC CCTGAGCTCC ACCCTGTACC GCAGGCCCTT CAACATCGGT ATC
BTK48     26           CTGTCCGTCC TGGATGGCAC TGAGTTCGC
BTK49     27           TCAGCAACTC GTCCGTGAGC

The final DNA sequence derived from pMON15741 was introduced into pMON15753 and contains the XbaI-SacI restriction fragment carrying nucleotides #669-1348 of the modified monocotyledonous CryIA(b) DNA sequence.

The desired sequence changes were made to the section of the starting DNA sequence in pMON15742 by the use of oligonucleotide primers BTK30-BTK41 as shown in Table 4 below.

TABLE 4

OLIGO #   SEQ ID NO:   SEQUENCE
BTK30     28           ATGTTCTCCT GGATTCATCG CAGCGCGGAG TTCAAC
BTK31     29           TCATTCCGTC CTCCCAAATC ACCCAAATCC CCCTCACCAA GTC
BTK32     30           ACCAAGTCCA CCAACCTGGG CAGCGGCACC TCCGTGGTGA AGGGCCCAGG CTT
BTK33     31           GGCTTCACGG GCGGCGACAT CCTGCGCAGG ACCTCCCCGG GCCAGATCAG CACCCT
BTK34     32           GCACCCTCCG CGTCAACATC ACCGCTCCCC TGTCCCAGAG GTAC GTACCGCGTC AGGAT
BTK35     33           AGGATTCGCT ACGCTAGCAC CACCAACCTG CAATTC
BTK36     34           ATCGACGGCA GGCCGATCAA TCAG
BTK37     35           TTCTCCGCCA CCATGTCCAG CGGCAGCAAC CTCCAATCCG G
BTK38     36           GCAGCTTCCG CACCGTGGGT TTCACCACCC CCTTCAACTT C
BTK39     37           AACTTCTCCA ACGGCTCCAG CGTTTTCACC CTGAGCGCTC A
BTK40     38           CTGAGCGCCC ACGTGTTCAA TTCCGGCAAT GAGGTGTACA TTGACCGCAT TGAGTT
BTK41     39           ATTGAGTTCG TGCCAGCCGA GGTCACCTTC GAAGGGGGGC C

Plasmids with the desired changes were identified by colony hybridization with the mutagenesis oligonucleotides at temperatures that prevent hybridization with the original template, but allow hybridization with the plasmids that had incorporated the desired target sequence changes. In some cases unexpected sequence alterations were found. These were corrected by the use of oligonucleotides BTK42-BTK43 as shown in Table 5 below.

TABLE 5

OLIGO #   SEQ ID NO:   SEQUENCE
BTK42     40           TGAAGGGCCC AGGCTTCACG GGCGGCGACA TCCTGCGCAG GACCTC
BTK43     41           CTAGCACCAC CAACCTGCAA TTCCACACCT CCATC

The final DNA sequence derived from pMON15742 was introduced into pMON15754 and contains the SacI-BstBI restriction fragment carrying nucleotides #1348-1833 of the modified monocotyledonous CryIA(b) DNA sequence.

The desired sequence changes were made to the section of the starting DNA sequence in pMON15740 by the use of oligonucleotide primers BTK00-BTK14 as shown in Table 6 below.

TABLE 6

OLIGO #   SEQ ID NO:   SEQUENCE
BTK00     42           GGGGATCCAC CATGGACAAC
BTK01     43           ATCAACGAGT GCATCCCGTA CAACTGCCTC AGCAACCCTG AGGTCGAGGT ACTTGG
BTK02     44           GAGGTCGAGG TGCTCGGCGG TGAGCGCATC GAGACCGGTT ACACCCCCAT CG
BTK03     45           ACATCTCCCT CTCCCTCACG CAGTTCCTGC TCAG
BTK04     46           GTGCCAGGCG CTGGCTTCGT CCTGGGCCTC GTGGACATCA TC
BTK05     47           ATCTGGGGCA TCTTTGGCCC CTCCCAGTGG GACGCCTTCC TGGT
BTK06     48           GTGCAAATCG AGCAGCTCAT CAACCAGAGG ATCGAGGAGT TCGC
BTK07     49           AGGCCATCAG CCGCCTGGAG GGCCTCAGCA ACCTCTACCA AATCTACGCT GAGAGCTT
BTK08     50           AGAGCTTCCG CGAGTGGGAG GCCGACCCCA CTAACCC
BTK09     51           CGCGAGGAGA TGCGCATCCA GTTCAACGAC
BTK10     52           ACAGCGCCCT GACCACCGCC ATCCCACTCT TCGCCGTCCA GAAC
BTK11     53           TACCAAGTCC CGCTCCTGTC CGTGTACGTC CAGGCCGCCA ACCTGCACCT CAG
BTK12     54           AGCTGCTGA GGGACGTCAG CGTGTTTGGC CAGAGGTGGG GCTTCGACGC CGCCACCATC AA
BTK13     55           ACCATCAACA GCCGCTACAA CGACCTCACC AGGCTGATCG GCAACTACAC
BTK14     56           CACGCTGTCC GCTGGTACAA CACTGGCCTG GAGCGCGTCT GGGGCCCTGA TTC

Plasmids with the desired changes were identified by colony hybridization with the mutagenesis oligonucleotides at temperatures that prevent hybridization with the original template, but allow hybridization with the plasmids that had incorporated the desired target sequence changes. In some cases unexpected sequence alterations were found. These were corrected by the use of oligonucleotides BTK50-BTK53 as shown in Table 7 below.

TABLE 7

OLIGO #   SEQ ID NO:   SEQUENCE
BTK50     57           GGCGCTGGCT TCGTCCT
BTK51     58           CAAATCTACG CTGAGAGCTT
BTK52     59           TAACCCAGCT CTCCGCGAGG AG
BTK53     60           CTTCGACGCC GCCACCAT

The final DNA sequence derived from pMON15740 was introduced into pMON15755 and contains the NcoI-XbaI restriction fragment carrying nucleotides #1-669 of the modified monocotyledonous CryIA(b) DNA sequence.

The desired sequence changes were made to the section of the starting DNA sequence in pMON10927 by the use of oligonucleotide primers BTK50D-BTK53D and BTK54-BTK61, and BTK63-BTK75 as shown in Table 8 below.

TABLE 8

OLIGO #   SEQ ID NO:   SEQUENCE
BTK50D    61           GGGCCCCCCT TCGAAGCCGA GTACGACCTG GAGAGAGC
BTK51D    62           AAGGCTGTCA ATGAGCTCTT CACGTCCAGC AATCAG
BTK52D    63           CAATCAGATC GGCCTGAAGA CCGACGTCAC TGACTA
BTK53D    64           ACTGACTACC ACATCGACCA AGTCTCCAAC CTCGTGGAGT GCCTCTCCGA TGAGT
BTK54     65           ACGAGAAGAA GGAGCTGTCC GAGAAGGTGA AGCATGCCAA GCG
BTK55     66           GGAATCTCCT CCAGGACCCC AATTTCCGCG GCATCAACA
BTK56     67           CAGGCAGCTC GACCGCGGCT GGCGCGGCAG CACCG
BTK57     68           AGCACCGACA TCACGATCCA GGGCGGCGAC GA
BTK58     69           AACTACGTGA CTCTCCTGGG CACTTTCGA
BTK59     70           GAGTCCAAGC TCAAGGCTTA CACTCGCTAC CAGCTCCGCG GCTACAT
BTK60     71           CAAGACCTCG AGATTTACCT GATCCGCTAC AACGCCAAGC A
BTK61     72           GAGACCGTCA ACGTGCCCGG TACTGG
BTK62     73           CTCTGGCCGC TGAGCGCCCC CAGCCCGATC GGCAAGTGTG
BTK63     74           CCCACCACAG CCACCACTTC TC
BTK64     75           GATGTGGGCT GCACCGACCT GAACGAGGAC CT
BTK65     76           AAGACCCAGG ACGGCCACGA GCGCCTGGC AACCT
BTK66     77           GGCAACCTGG AGTTCCTCGA GGGCAGGGCC CCCCTGGTCG GT
BTK67     78           GTCGGTGAGG CTCTGGCCAG GGTCAAGAGG GCTGAGAAGA A
BTK68     79           AGGGACAAGC GCGAGAAGCT CGAGTGGGAG ACCAACATCG T
BTK69     80           GAGGCCAAGG AGAGCGTCGA CGCCCTGTTC GTG
BTK70     81           AACTCCCAGT ACGACCGCCT GCAGGCCGAC AC
BTK71     82           ATCCACGCTG CCGACAAGAG GGTGCACA
BTK72     83           GCATTCGCGA GGCCTACCTG CCTGAGCTGT CCGTG
BTK73     84           GCCATCTTTG AGGAGCTGGA GGGCCGCATC TTTAC
BTK74     85           CATTCTCCCT GTACGACGCC CGCAACGTGA TCAAGAA
BTK75     86           GGCCTCAGCT GGAATTCCTG

Plasmids with the desired changes were identified by colony hybridization with the mutagenesis oligonucleotides at temperatures that prevent hybridization with the original template, but allow hybridization with the plasmids that had incorporated the desired target sequence changes. In some cases unexpected sequence alterations were found. These were corrected by the use of oligonucleotides BTK91 and BTK94 as shown in Table 9 below.

TABLE 9

OLIGO #   SEQ ID NO:   SEQUENCE
BTK91     87           CAAGAGGGCT GAGAAGAAGT GGAGGGACAA G
BTK94     88           TACTGGTTCC CTCTGGCCGC TGAGCGCCCC CAGCCCGATC GGCAAGTGTG CCCACCACA

The final DNA sequence derived from pMON10927 was introduced into pMON10947 and contains the BstBI-PvuII restriction fragment carrying nucleotides #1833-2888 of the modified monocotyledonous CryIA(b) DNA sequence.

The desired sequence changes were made to the section of the starting DNA sequence in pMON10928 by the use of oligonucleotide primers BTK76-BTK90 as shown in Table 10 below.

TABLE 10

OLIGO #   SEQ ID NO:   SEQUENCE
BTK76     89           ATAAGCTTCA GCTGCTGGAA CGTCAAGGGC CACGTGGACG TCGAGGAAC
BTK77     90           AGAACAACCA CCGCTCCGTC CTGGTCGTCC CAGAGTGGGA
BTK78     91           GAGTGGGAGG CTGAGGTCTC CCAAGA
BTK79     92           CAAGAGGTCC GCGTCTGCCC AGGCCGCGGC TACATTCTCA GGGTCACCGC TTA
BTK80     93           AAGGAGGGCT ACGGTGAGGGC TGTGTGACCA T
BTK81     94           AACTGCGTGG AGGAGGAGGT GTACCCAAAC AACAC
BTK82     95           GACTACACCG CCACCCAGGA GGAGTACGAG GGCACCTACA CT
BTK83     96           CCTACACTTC CAGGAACAGG GGCTACGATG GTGCCTACGA GAGCAACAGC AGCGTTCCTG
BTK84     97           CTGACTACGC TTCCGCCTAC GAGGAGAAGG CTACAC
BTK85     98           CCTACACGGA TGGCCGCAGG GACAACCCTT G
BTK86     99           CTTGCGAGAG CAACCGCGGC TACGGCGACT ACAC
BTK87     100          GACTACACTC CCCTGCCCGC CGGCTACGTT ACCA
BTK88     101          AGGAGCTGGA GTACTTCCCG GAGACTGACA AGGTGTGGA
BTK89     102          TCGAGATCGG CGAGACCGAG GGCACCTTCA T
BTK90     103          GTGGAGCTGC TCCTGATGGA GGAGTAGAAT TCCTCTAAGC T

Plasmids with the desired changes were identified by colony hybridization with the mutagenesis oligonucleotides at temperatures that prevent hybridization with the original template, but allow hybridization with the plasmids that had incorporated the desired target sequence changes. In one case an unexpected sequence alteration was found. This was corrected by the use of oligonucleotide BTK92 as shown in Table 11 below.

TABLE 11

OLIGO #   SEQ ID NO:   SEQUENCE
BTK92     104          CTGGTCGTCC CAGAGTGGGA GGCTGAGGTC TCCCAAGAGG TCCGCGTCTG CCCAGGCCG

The final DNA sequence derived from pMON10928 was introduced into pMON10944 and contains the PvuII-EcoRI restriction fragment carrying nucleotides #2888-3473 of the modified monocotyledonous CryIA(b) DNA sequence.

pMON15742 was subjected to oligonucleotide mutagenesis with oligonucleotide BTK41 (SEQ ID NO: 39) to form pMON15767. The resulting B. t. k. CryIA DNA fragment of pMON15767 was excised with SacI and BstBI and inserted into the SacI and BstBI sites of pMON19689 to form pMON15768, which contains the NcoI-BstBI restriction fragment made up of nucleotides 7-1811 of the starting DNA sequence attached to nucleotides 1806-1833 of the modified DNA sequence.

Intermediate clones were prepared as follows: The SacI-BstBI fragment from pMON15754 was inserted into the SacI-BstBI sites of pMON19689 to form pMON15762, which contains nucleotides 7-1354 of the starting DNA sequence attached to nucleotides 1348-1833 of the modified DNA sequence; the XbaI to BstBI fragment of pMON19689 was excised and replaced with the XbaI to SacI fragment from pMON15753 and the SacI-BstBI fragment from pMON15762, resulting in pMON15765, which contains a truncated B.t. CryIA(b) DNA sequence where approximately the first third of the sequence, from NcoI to XbaI of the starting DNA sequence, is attached to XbaI-BstBI of the modified DNA sequence. Plasmid pMON15766 was prepared by excising the NcoI-XbaI fragment of pMON15765 and replacing it with the NcoI-XbaI fragment from pMON15755. pMON15766 thus encodes a truncated CryIA(b) sequence composed of nucleotides 1-1833 of the modified DNA sequence.

The final full length clones were prepared as follows: pMON10948, which encodes the full length CryIA(b) DNA sequence prepared in accordance with the method of this invention, was made by inserting the BstBI to PvuII CryIA(b) fragment from pMON10947 and the PvuII-EcoRI fragment from pMON10944 into the BstBI-EcoRI site of pMON15766. The CryIA(b) B.t. DNA sequence of pMON10948 consists of the modified DNA sequence having nucleotides 1-3473. pMON10949 encodes a full-length CryIA(b) DNA sequence where the first half of the gene consists of nucleotides 7-1811 of the starting DNA sequence attached to nucleotides 1806-3473 of the modified DNA sequence. pMON10949 was made by inserting the BstBI to EcoRI fragment from pMON10948 into the BstBI-EcoRI site of pMON15768. The sequence of the CryIA(b) DNA sequence in pMON10949 is identified as SEQ ID NO: 105 and is shown in FIG. 9. pMON15722 was derived from pMON10948 by excising the entire CryIA(b) modified DNA sequence cassette, including the ECaMV promoter, hsp70 intron and NOS 3′ polyadenylation site region, as a NotI fragment and inserting it between the NotI sites of pMON19470 (this does not change any of the modified B. t. k. CryIA DNA sequence). pMON15774 was derived from pMON10948 by excising the entire CryIA(b) DNA sequence, including the promoter, intron, CryIA(b) coding sequence, and NOS 3′ polyadenylation site region, as a NotI fragment and inserting it between the NotI sites of pMON19470 (this does not change any of the B.t. DNA sequences).

The resulting modified CryIA(b) DNA sequence (SEQ ID NO: 1) has a total abundance of 0.25% rare monocotyledonous codons and 3.8% semi-rare monocotyledonous codons. The total abundance of more preferred monocotyledonous codons is 86%. The CG dinucleotide frequency in the resulting modified DNA sequence was 7.5%. The modified CryIA(b) DNA sequence is compared to the wild-type bacterial CryIA(b) DNA sequence in FIG. 13.
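Summary figures of this kind can be recomputed for any candidate design with a few lines of Python. The sketch below is illustrative only: the rare and semi-rare codon sets are placeholders for the classes defined by the codon frequency table, and the CG percentage is taken as the number of CG occurrences divided by the number of dinucleotide positions, which is one plausible definition since the text does not state how the 7.5% value was calculated.

```python
def codon_class_percentages(seq, rare, semi_rare):
    """Percentage of in-frame codons falling in the rare and semi-rare sets."""
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    n_rare = sum(codon in rare for codon in codons)
    n_semi = sum(codon in semi_rare for codon in codons)
    return 100.0 * n_rare / len(codons), 100.0 * n_semi / len(codons)


def cg_percentage(seq):
    """CG dinucleotides as a percentage of all dinucleotide positions (assumed definition)."""
    return 100.0 * seq.count("CG") / (len(seq) - 1)


# Toy input and placeholder codon classes, for illustration only.
toy = "ATGCGCAAACGT"
rare_pct, semi_pct = codon_class_percentages(toy, rare={"CGT"}, semi_rare={"AAA"})
print(rare_pct, semi_pct, round(cg_percentage(toy), 2))
```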

EXAMPLE 2

This example illustrates the transient gene expression of the modified B. t. k. CryIA DNA sequence described in Example 1 and the CryIA(b)/CryIA(c) B.t. DNA sequence modified by the Fischoff et al. method in corn leaf protoplasts.

The levels of expression of the modified CryIA(b) B.t. DNA sequence in pMON10948 and pMON15772, which contain the B. t. k. CryIA DNA sequence modified in accordance with the method of the present invention, the dicot/modified CryIA(b) B.t. DNA sequence in pMON10949, which has the 5′ half of the DNA sequence modified in accordance with the method of Fischoff et al. and the 3′ half modified in accordance with the method of the present invention, and the CryIA(b)/CryIA(c) DNA sequence modified by the Fischoff et al. method in pMON19493, were compared in a transient gene expression system in corn leaf protoplasts. The protoplasts were isolated from young corn seedlings. The DNA sequences were transferred into the protoplasts by electroporation and, after allowing time for gene expression, the electroporated samples were harvested and analyzed for gene expression. Samples were run in duplicate and the ELISA values (performed in triplicate) were averaged for each experiment. The protein levels were measured by ELISA and the values indicated that 9-fold more CryIA(b) protein was produced from the modified B. t. k. CryIA DNA sequence in pMON10948 or pMON15772 than from pMON19493 containing the prior art CryIA(b)/CryIA(c) DNA sequence. The mixed B.t. DNA sequence in pMON10949 was expressed at 7-fold higher levels than pMON19493, indicating that most of the benefit of the modified B.t. DNA sequence of this invention is in the 3′ portion of the CryIA(b) DNA sequence. These data are presented in Table 12.

TABLE 12

Construct tested   Avg. Expt 1 (ng Btk/ml)   Avg. Expt 2 (ng Btk/ml)
19493              13.6                      8.3
10949              103                       57
10948              138                       66.4
15722              nd                        72.7
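The fold increases quoted in this example can be reproduced from the Table 12 values. The Python sketch below averages the per-experiment ratios against the 19493 reference; averaging the two ratios is an assumption about how the fold values were summarized, and the "nd" entry is simply skipped.

```python
# Table 12 values in ng Btk/ml; None marks the "nd" entry.
TABLE_12 = {
    "19493": (13.6, 8.3),
    "10949": (103.0, 57.0),
    "10948": (138.0, 66.4),
    "15722": (None, 72.7),
}


def fold_over_reference(table, reference="19493"):
    """Mean of the per-experiment ratios of each construct to the reference."""
    folds = {}
    for construct, values in table.items():
        ratios = [
            value / ref
            for value, ref in zip(values, table[reference])
            if value is not None
        ]
        folds[construct] = sum(ratios) / len(ratios)
    return folds


folds = fold_over_reference(TABLE_12)
for construct, fold in folds.items():
    print(construct, round(fold, 1))
```

Under this convention the ratios for 10948 and 10949 round to 9 and 7 respectively, in line with the fold differences stated in the text.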

EXAMPLE 3

This Example illustrates the expression of a modified B.t. DNA sequence modified by the method of the present invention in stably transformed corn cells.

Black Mexican Sweet (BMS) suspension cells were stably transformed using the microprojectile bombardment method and the chlorsulfuron EC9 selectable marker. Transgenic calli expressing the DNA sequence were initially identified by their insecticidal activity against tobacco hornworm larvae in a diet assay containing the calli. B.t. protein levels from individual insecticidal transgenic BMS calli were measured by ELISA from 48 calli expressing pMON15772 DNA and 45 calli expressing pMON19493. This comparison found that the average B.t. protein level produced in pMON15772 calli was 6.5-fold higher than the average B.t. protein level produced in pMON19493 calli. Western blot analysis confirmed the ELISA results and showed that the shorter processed forms of the proteins, predominantly the CryIA(b) portion, were in the extracts. These results demonstrate that the B. t. k. CryIA DNA sequence modified according to the method of the present invention functions better than the dicot CryIA(b)/CryIA(c) DNA sequence in stably transformed corn cells.

The ELISA assay used herein is a direct double antibody sandwich that utilizes a single polyclonal rabbit antibody raised against CryIA(b) as antigen (F137) for the capture and detection of the B. t. k. CryIA protein. Unconjugated antibody is used to coat 96-well polystyrene dishes. Alkaline phosphatase conjugated F137 antibody is added to the antibody coated dishes along with the test extracts or purified standard and allowed to incubate. The amount of B. t. k. CryIA protein present in the sample is directly proportional to the amount of alkaline phosphatase-antibody bound. Color development with p-nitrophenyl phosphate allows for quantitation of the CryIA(b) concentration in the samples using linear regression of the calibration curve prepared with the purified CryIA(b) protein standard.
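Quantitation against a linear calibration curve, as described above, can be sketched in Python with an ordinary least-squares fit. The standard concentrations and absorbance readings below are hypothetical numbers chosen for illustration; they are not data from the assay, and the function names are invented for the example.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def concentration(absorbance, slope, intercept):
    """Invert the calibration line to read a concentration from an absorbance."""
    return (absorbance - intercept) / slope


# Hypothetical purified CryIA(b) standards (ng/ml) and their absorbances.
standards = [0.0, 10.0, 20.0, 40.0]
readings = [0.05, 0.25, 0.45, 0.85]

slope, intercept = fit_line(standards, readings)
sample_ng_per_ml = concentration(0.65, slope, intercept)
print(round(slope, 4), round(intercept, 4), round(sample_ng_per_ml, 2))
```

Readings are only meaningful inside the range covered by the standards; extracts that fall above the highest standard would normally be diluted and re-assayed.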

Because the CryIA(b)/CryIA(c) protein differs from the CryIA(b) protein in the carboxyl terminus region, it needed to be confirmed that the ELISA measurements were accurately quantitating the CryIA(b)/CryIA(c) and CryIA(b) proteins produced from the full length synthetic DNA sequences. A trypsin treatment was used to produce identical amino terminal truncated CryIA(b) proteins in each extract. Bovine pancreatic trypsin (Calbiochem) was prepared as a 5 mg/ml solution in 50 mM sodium carbonate, pH 8.5-9, and 3.5 μl of the trypsin solution was added per 100 μl tissue extract, mixed and incubated at 23° C. for 1.5 hours. The reaction was stopped by the addition of 2.5 μl of a 50 mM solution of PMSF in isopropanol per 100 μl extract.
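The amounts implied by the trypsin treatment can be worked out directly from the stated volumes and stock concentrations. The Python sketch below does the arithmetic; it assumes the volumes are simply additive, and the function name is illustrative.

```python
def trypsin_digest(extract_ul=100.0, trypsin_ul=3.5, trypsin_mg_per_ml=5.0,
                   pmsf_ul=2.5, pmsf_mm=50.0):
    """Return (trypsin mass in ug, final volume in ul, final PMSF in mM)."""
    trypsin_ug = trypsin_ul * trypsin_mg_per_ml  # ul x mg/ml gives ug
    final_ul = extract_ul + trypsin_ul + pmsf_ul  # volumes assumed additive
    pmsf_final_mm = pmsf_ul * pmsf_mm / final_ul
    return trypsin_ug, final_ul, pmsf_final_mm


trypsin_ug, final_ul, pmsf_final_mm = trypsin_digest()
print(trypsin_ug, final_ul, round(pmsf_final_mm, 2))
```

Each 100 μl of extract therefore receives 17.5 μg of trypsin, and the reaction is stopped at a final PMSF concentration of roughly 1.2 mM.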

A Western blot of the trypsin treated and untreated samples demonstrated that adding trypsin did convert the CryIA(b)/CryIA(c) and CryIA(b) proteins into a truncated size identical to the amino terminal portion of trypsin treated bacterial CryIA(b). The abundance of the B.t. proteins in either the untreated or trypsin treated samples was comparable to that found by the ELISA measurements of the protoplast extracts. This confirms that the ELISA assays accurately measure the amount of B.t. protein present, regardless of whether it is CryIA(b)/CryIA(c) or CryIA(b). The Western blot independently confirmed that the CryIA(b) DNA sequence prepared in accordance with the method of the present invention and the mixed prior art/modified CryIA(b) DNA sequence expressed at considerably greater levels than the B.t. full length synthetic DNA sequence of the prior art in pMON19493.

Additionally, the Western blot revealed that in protoplast extracts a considerable portion of the B.t. protein, either CryIA(b)/CryIA(c) or CryIA(b), was present as shorter, processed forms of the full-length B.t. protein. Similar processed B.t. protein forms are present in extracts from both transgenic callus and plant tissue. This further explains why the ELISA assay provides accurate results against both the CryIA(b)/CryIA(c) and CryIA(b) proteins from the full-length DNA sequences, as it is effectively measuring the same amino terminal portions of the proteins.

EXAMPLE 4

This Example illustrates the expression of pMON15772 and pMON19493 in transgenic corn plants.

A highly embryogenic, friable Type II callus culture is the preferred tissue for obtaining transgenic, whole corn plants. The age of the embryogenic culture can range from the initial callus formation on the immature embryos, approximately one week after embryo isolation, to older established cultures of 6 months to 2 years old; however, it is preferred to use younger cultures to enhance the potential for recovery of fertile transgenic plants. Type II cultures were initiated from immature Hi-II embryos on N6 2-100-25 medium containing 10 μM silver nitrate and solidified with 0.2% Phytagel. The most friable Type II calli were picked after about two weeks growth and transferred onto fresh N6 2-100-25 medium containing 10 μM silver nitrate, in the center of the plate, in preparation for bombardment.

Four days after the calli were picked and transferred, the corn cells were bombarded 2 or 3 times with M10 tungsten particles coated with pMON15772 or pMON19493 mixed with pMON19574 as the selectable marker plasmid, using the particle preparation protocol described below. M10 particles at 100 mg/ml in 50% glycerol are sonicated to resuspend the particles. An aliquot of 12.5 μl is placed into a small microfuge tube and 2.5 μl of the desired DNA at 1 μg/μl is added and mixed well by pipetting up and down rapidly several times. A freshly prepared CaCl₂/spermidine pre-mix is added in an amount of 17.5 μl and again mixed thoroughly. The particles are allowed to settle undisturbed for about 20 minutes and then 12.5 μl of the supernatant is removed. The particles are then ready for use and are used in microprojectile bombardment within one hour of their preparation.
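The volumes in the particle preparation can be tallied with a few lines of arithmetic. The following is only a bookkeeping sketch: it assumes the particle aliquot is 12.5 μl (the same volume later withdrawn as supernatant) and, as a simplification not stated in the text, that all of the DNA ends up associated with the tungsten. The constant and function names are illustrative.

```python
# Bookkeeping sketch for the particle preparation described above.
# Assumption: the particle aliquot is 12.5 ul; all names are illustrative.
PARTICLE_STOCK_MG_PER_ML = 100.0   # M10 tungsten in 50% glycerol
PARTICLE_ALIQUOT_UL = 12.5         # assumed aliquot volume
DNA_UL = 2.5                       # volume of plasmid DNA added
DNA_UG_PER_UL = 1.0                # DNA concentration
PREMIX_UL = 17.5                   # CaCl2/spermidine pre-mix
SUPERNATANT_REMOVED_UL = 12.5      # withdrawn after settling


def particle_prep_summary():
    """Return tungsten mass, DNA mass, and volumes for one preparation."""
    tungsten_mg = PARTICLE_STOCK_MG_PER_ML * PARTICLE_ALIQUOT_UL / 1000.0
    dna_ug = DNA_UL * DNA_UG_PER_UL
    mixed_ul = PARTICLE_ALIQUOT_UL + DNA_UL + PREMIX_UL
    final_ul = mixed_ul - SUPERNATANT_REMOVED_UL
    return {
        "tungsten_mg": tungsten_mg,
        "dna_ug": dna_ug,
        "mixed_volume_ul": mixed_ul,
        "final_volume_ul": final_ul,
        # Upper bound: assumes every microgram of DNA binds the particles.
        "max_dna_ug_per_mg_tungsten": dna_ug / tungsten_mg,
    }


if __name__ == "__main__":
    for key, value in particle_prep_summary().items():
        print(f"{key}: {value}")
```

Under these assumptions each preparation holds 1.25 mg of tungsten and 2.5 μg of DNA in a final volume of 20 μl.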

After bombardment, the cells were transferred to fresh N6 2-100-25 medium containing 10 μM silver nitrate for seven days without any selective pressure. The cells were then transferred to N6 1-0-25 media containing 3 mM glyphosate. Two weeks later, the cells were transferred to fresh selective media of the same composition. After a total of 6 or 7 weeks post-bombardment, glyphosate-tolerant calli could be observed growing on the selection media. Occasionally, the cell population would be transferred to fresh selective plates at this time to carry on the selection for 10-12 weeks total time. Glyphosate resistant calli were picked onto fresh N6 1-0-25 media containing 3 mM glyphosate for increasing the amount of callus tissue prior to initiating plant regeneration.

Plant regeneration was initiated by placing the transgenic callus tissue on MS 0.1 ID media for two weeks. At two weeks, the tissue was transferred to N6 6% OD media for another two week period. The regenerating tissues were then transferred to MS 0 D media and transferred into lighted growth chambers. After another two weeks in the same media in larger containers, the young plants were hardened off, followed by transfer to the greenhouse where they were maintained in the same manner as normal corn plants. In most instances, the regeneration process was performed with 0.01 mM glyphosate in the regeneration media.

The corn plants were allowed to grow and the level of B. t. k. CryIA protein expressed in the leaves of the plants was measured by ELISA. As is commonly observed in transgenic plants, a large range of expression values was observed and, therefore, a large number of independently derived transgenic plants were examined. The B.t. levels in 44 pMON19493 plants and 86 pMON15722 plants were measured by ELISA assays of leaf material. Each line of plants was derived from embryogenic callus expressing the B.t. DNA sequence as determined by insecticidal activity against tobacco hornworm. Thus, the transformants that do not express the B.t. DNA sequence, as occurs in the transformation process, are not included in the data set. Western blots demonstrated that the majority of B.t. protein in the leaf extracts was processed to the predominantly CryIA(b) form of the protein, which has been shown to be recognized equivalently by the ELISA antibody assay. These results illustrate that the average level of B.t. expression with pMON15722 plants is at least 5 fold higher than the average level of B.t. expression from pMON19493 plants, as shown in Table 13.

TABLE 13

B.t. protein (% of total protein)

Gene         <0.001   <0.005   <0.025   <0.05   <0.1   >0.1
pMON19493      37        4        3       0       0      0
pMON15722      25       43        5       4       3      6

This data is presented in graph form in FIG. 10.
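The binned counts of Table 13 can be summarized as the share of plants at or above a given expression bin. The sketch below is illustrative only: it assumes each column is an exclusive bin bounded above by its header (consistent with the rows summing to 44 and 86 plants), and the helper name is hypothetical. Binned counts cannot reproduce the average expression levels themselves.

```python
# Counts copied from Table 13; each column is treated as an exclusive bin
# (an assumption consistent with the rows summing to 44 and 86 plants).
BINS = ["<0.001", "<0.005", "<0.025", "<0.05", "<0.1", ">0.1"]
COUNTS = {
    "pMON19493": [37, 4, 3, 0, 0, 0],
    "pMON15722": [25, 43, 5, 4, 3, 6],
}


def fraction_at_or_above(counts, first_bin):
    """Fraction of plants falling in bin index `first_bin` or any higher bin."""
    return sum(counts[first_bin:]) / sum(counts)


if __name__ == "__main__":
    for gene, counts in COUNTS.items():
        total = sum(counts)
        # Index 1 is the first bin above 0.001% of total protein.
        share = 100.0 * fraction_at_or_above(counts, 1)
        print(f"{gene}: {total} plants, {share:.1f}% at or above 0.001%")
```

On this reading, 7 of 44 pMON19493 plants and 61 of 86 pMON15722 plants fall at or above 0.001% of total protein.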

EXAMPLE 5

This example illustrates the preparation of another form of a crystal toxin protein from B.t., namely the CryIIB DNA sequence (Widner et al., J. Bact. 171: 965-974), according to the method of the present invention and also utilizing the 6mer analysis of the DNA sequence to construct a modified DNA sequence that exhibits enhanced expression in a monocotyledonous plant.

The starting DNA sequence for this Example was the CryIIA synthetic DNA sequence identified by SEQ ID NO: 106. The CryIIB synthetic DNA sequence was constructed from SEQ ID NO: 106 by a new gene construction process. The CryIIA gene was used as a template for annealing oligonucleotides. These oligonucleotides fit precisely adjacent to each other such that DNA ligase could close the gap to form a covalent linkage. After the ligation reaction, the linked oligonucleotides were amplified by PCR and subcloned. Thus, this process is a form of oligonucleotide mutagenesis that ligates the oligonucleotides into one contiguous fragment of the desired new sequence. Because of the large size of the CryIIA gene, the process was carried out on five smaller fragments designated A, B, C, D and E. A representation of the steps by which the CryIIB synthetic DNA sequence of the present invention was prepared is presented in FIG. 11.

A double stranded plasmid containing the CryIIA synthetic DNA sequence (SEQ ID NO: 106) in pBSKS+, referred to hereinafter as the P2syn DNA sequence, was digested and used as an annealing template for the different oligonucleotide combinations. For the A fragment, oligonucleotides A1 through A4, as shown in Table 14, were annealed to linearized pP2syn and ligated with T4 DNA ligase. The new strand of the contiguous oligonucleotides was amplified using primers AP5 and AP3, as shown in Table 14, under standard PCR conditions. The amplified double stranded fragment was digested with the restriction enzymes XbaI and BamHI and cloned into similarly digested pBSKS+ to form pMON19694.

TABLE 14

OLIGO #   ID NO:            SEQUENCE
A1        SEQ ID NO: 107    TCTAGAAGAT CTCCACCATG GACAACTCCG TCCTGAACTC TGGTCGCACC ACCATCT
A2        SEQ ID NO: 108    GCGACGCCTA CAACGTCGCG GCGCATGATC CATTCAGCTT CCAGCACAAG AGCCTCGACA CTGTTCAGAA
A3        SEQ ID NO: 109    GGAGTGGACG GAGTGGAAGA AGAACAACCA CAGCCTGTAC CTGGACCCCA TCGTCGGCAC GGTGGCCAGC TTCCT
A4        SEQ ID NO: 110    TCTCAAGAAG GTCGGCTCTC TCGTCGGGAA GCGCATCCTC TCGGAACTCC GCAACCTGAT CAGGATCC
AP5       SEQ ID NO: 111    CCATCTAGAA GATCTCCACC
AP3       SEQ ID NO: 112    TGGGGATCCT GATCAGGTTG
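The ligation-and-amplification logic for fragment A can be checked in silico. The sketch below joins oligonucleotides A1 through A4 end to end, as T4 DNA ligase would on the template, and then tests whether primers AP5 and AP3 match the two ends of the joined strand. The sequences are copied from Table 14. The function names and the three-base 5' primer tail are assumptions drawn from inspecting the sequences, not statements from the text.

```python
# Sequences copied from Table 14; spaces are removed before use.
OLIGOS = {
    "A1": "TCTAGAAGAT CTCCACCATG GACAACTCCG TCCTGAACTC TGGTCGCACC ACCATCT",
    "A2": "GCGACGCCTA CAACGTCGCG GCGCATGATC CATTCAGCTT CCAGCACAAG AGCCTCGACA CTGTTCAGAA",
    "A3": "GGAGTGGACG GAGTGGAAGA AGAACAACCA CAGCCTGTAC CTGGACCCCA TCGTCGGCAC GGTGGCCAGC TTCCT",
    "A4": "TCTCAAGAAG GTCGGCTCTC TCGTCGGGAA GCGCATCCTC TCGGAACTCC GCAACCTGAT CAGGATCC",
}
AP5 = "CCATCTAGAA GATCTCCACC".replace(" ", "")
AP3 = "TGGGGATCCT GATCAGGTTG".replace(" ", "")


def revcomp(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def ligate(names):
    """Join adjacent oligonucleotides into one contiguous strand."""
    return "".join(OLIGOS[name].replace(" ", "") for name in names)


def primers_match_ends(fragment, forward, reverse, tail=3):
    """Check both primers against the fragment ends.

    `tail` is the number of extra 5' bases on each primer that are not part
    of the ligated strand (an assumption based on the sequences themselves).
    """
    return (fragment.startswith(forward[tail:])
            and fragment.endswith(revcomp(reverse[tail:])))


fragment_a = ligate(["A1", "A2", "A3", "A4"])

if __name__ == "__main__":
    print(len(fragment_a), "nt")
    print("primers match ends:", primers_match_ends(fragment_a, AP5, AP3))
    print("XbaI site at 5' end:", fragment_a.startswith("TCTAGA"))
    print("BamHI site at 3' end:", fragment_a.endswith("GGATCC"))
```

The joined strand begins with an XbaI site (TCTAGA) and ends with a BamHI site (GGATCC), in keeping with the XbaI/BamHI digestion used to clone the fragment.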

For the B fragment, oligonucleotides B1 through B6 were annealed to pP2syn and ligated with T4 DNA ligase. The new strand of the contiguous oligonucleotides was amplified using primers BP5 and BP3, as shown in Table 15, under standard PCR conditions. The amplified double stranded fragment was digested with the restriction enzymes BglII and PstI and cloned into similarly digested pMON19694 to form pMON19700.

TABLE 15

OLIGO #   ID NO:            SEQUENCE
B1        SEQ ID NO: 113    AGATCTTTCC ATCTGGCTCC ACCAACCTCA TGCAAGACAT CCTCAGGGAG ACCGAGAAGT TTCTCAACCA GCGCCTCAAC A
B2        SEQ ID NO: 114    CTGATACCCT TGCTCGCGTC AACGCTGAGC TGACGGGTCT GCAAGCAAAC GTGGAGGAGT TCAACCGCCA AGTGG
B3        SEQ ID NO: 115    ACAACTTCCT CAACCCCAAC CGCAATGCGG TGCCTCTGTC CATCA
B4        SEQ ID NO: 116    CTTCTTCCGT GAACACCATG CAACAACTGT TCCTCAACCG CTTGCCTCAG TTCCAGATGC AAGGC
B5        SEQ ID NO: 117    TACCAGCTGC TCCTGCTGCC ACTCTTTGCT CAGGCTGCCA ACCTGCACCT CTCCTTCATT CGTGACGTG
B6        SEQ ID NO: 118    ATCCTCAACG CTGACGAGTG GGGCATCTCT GCAG
BP5       SEQ ID NO: 119    CCAAGATCTT TCCATCTGGC
BP3       SEQ ID NO: 120    GGTCTGCAGA GATGCCCCAC

For the C fragment, oligonucleotides C1 through C7, as shown in Table 16, were annealed to pP2syn and ligated with T4 DNA ligase. The new strand of the contiguous oligonucleotides was amplified using primers CP5 and CP3, as shown in Table 16, under standard PCR conditions. The amplified double stranded fragment was digested with the restriction enzymes PstI and XhoI and cloned into similarly digested pBSKS+ to form pMON19697.

TABLE 16

OLIGO #   ID NO:            SEQUENCE
C1        SEQ ID NO: 121    CTGCAGCCAC GCTGAGGACC TACCGCGACT ACCTGAAGAA CTACACCAGG GACTACTCCA ACTATTG
C2        SEQ ID NO: 122    CATCAACACC TACCAGTCGG CCTTCAAGGG CCTCAATACG AGGCTTCACG ACATGCTGGA GTTCAGGAC
C3        SEQ ID NO: 123    CTACATGTTC CTGAACGTGT TCGAGTACGT CAGCATCTGG TCGCTCTTCA AG
C4        SEQ ID NO: 124    TACCAGAGCC TGCTGGTGTC CAGCGGCGCC AACCTCTACG CCAGCGGCTC TGGTCCCCAA CAACTCA
C5        SEQ ID NO: 125    GAGCTTCACC AGCCAGGACT GGCCATTCCT GTATTCGTTG TTCCAAGTCA A
C6        SEQ ID NO: 126    CTCCAACTAC GTCCTCAACG GCTTCTCTGG TGCTCGCCTC TCCAACACCT TCCCCAA
C7        SEQ ID NO: 127    CATTGTTGGC CTCCCCGGCT CCACCACAAC TCATGCTCTG CTTGCTGCCA GAGTGAACTA CTCCGGCGGC ATCTCGAG
CP5       SEQ ID NO: 128    CCACTGCAGC CACGCTGAGG ACC
CP3       SEQ ID NO: 129    GGTCTCGAGA TGCCGCCGGA

For the D fragment, oligonucleotides D1 through D7, as shown in Table 17, were annealed to pP2syn and ligated with T4 DNA ligase. The new strand of the contiguous oligonucleotides was amplified using primers DP5 and DP3, as shown in Table 17, under standard PCR conditions. The amplified double stranded fragment was digested with the restriction enzymes XhoI and KpnI and cloned into similarly digested pBSKS+ to form pMON19702.

TABLE 17

OLIGO #   ID NO:            SEQUENCE
D1        SEQ ID NO: 130    ATTGGTGCAT CGCCGTTCAA CCAGAACTTC AACTGCTCCA CCTTCCTGCC GCCGCTGCTC ACCCCGTTCG TGAGGT
D2        SEQ ID NO: 131    CCTGGCTCGA CAGCGGCTCC GACCGCGAGG GCGTGGCCAC CGTCACCAAC TGGCAAACC
D3        SEQ ID NO: 132    GAGTCCTTCG AGACCACCCT TGGCCTCCGG AGCGGCGCCT TCACGGCGCG TGGG
D4        SEQ ID NO: 133    AATTCTAACT ACTTCCCCGA CTACTTCATC AGGAACATCT CTGG
D5        SEQ ID NO: 134    TGTTCCTCTC GTCGTCCGCA ACGAGGACCT CCGCCGTCCA CTGCACTACA ACGAGATCAG GAA
D6        SEQ ID NO: 135    CATCGCCTCT CCGTCCGGGA CGCCCGGAGG TGCAAGGGCG TACATGGTGA GCGTCCATAA C
D7        SEQ ID NO: 136    AGGAAGAACA ACATCCACGC TGTGCATGAG AACGGCTCCA TGAT
DP5       SEQ ID NO: 137    CCACTCGAGC GGCGACATTG GTGCATCGCC G
DP3       SEQ ID NO: 138    GGTGGTACCT GATCATGGAG CCGTTCTCAT GCA

For the E fragment, oligonucleotides E1 through E8, as shown in Table 18, were annealed to pP2syn and ligated with T4 DNA ligase. The new strand of the contiguous oligonucleotides was amplified using primers EP5 and EP3, as shown in Table 18, under standard PCR conditions. The amplified double stranded fragment was digested with the restriction enzymes BamHI and KpnI and cloned into similarly digested pBSKS+ to form pMON19698.

TABLE 18

OLIGO #   ID NO:            SEQUENCE
E1        SEQ ID NO: 139    GGATCCACCT GGCGCCCAAT GATTACACCG GCTTCACCAT CTCTCCAATC CACGCCACCC AAGT
E2        SEQ ID NO: 140    GAACAACCAG ACACGCACCT TCATCTCCGA GAAGTTCGGC AACCAGGGCG ACTCCCTGAG GT
E3        SEQ ID NO: 141    TCGAGCAGAA CAACACCACC GCCAGGTACA CCCTGCGCGG CAACGGCAAC AGCTACAACC TGTACCTGCG CGTCAGCTCC A
E4        SEQ ID NO: 142    TTGGCAACTC CACCATCAGG GTCACCATCA ACGGGAGGGT GTACACAGCC ACCAATGTGA ACACGACGAC CAACAATG
E5        SEQ ID NO: 143    ATGGCGTCAA CGACAACGGC GCCCGCTTCA GCGACATCAA C
E6        SEQ ID NO: 144    ATTGGCAACG TGGTGGCCAG CAGCAACTCC GACGTCCCGC TGGACAT
E7        SEQ ID NO: 145    CAACGTGACC CTGAACTCTG GCACCCAGTT CGACCTCATG AA
E8        SEQ ID NO: 146    CATCATGCTG GTGCCAACTA ACATCTCGCC GCTGTACTGA TAGGAGCTCT GATCAGGTAC C
EP5       SEQ ID NO: 147    GGAGGATCCA CCTGGCGCCC A
EP3       SEQ ID NO: 148    GGTGGTACCT GATCAGAGCT

Some sequence errors occurred during the construction process. The repair oligonucleotides A5 and A6 were used to repair fragment A, and oligonucleotides B7-B10, C8-C10, D8-D10, and E9-E11 were used to repair fragments B-E, respectively, using the single stranded oligonucleotide mutagenesis described in Example 1. These oligonucleotides are shown in Table 19.

TABLE 19

OLIGO #   ID NO:            SEQUENCE
A5        SEQ ID NO: 149    CCACCATGGA CAACTCCGTC
A6        SEQ ID NO: 150    GGAAGAAGAA CAACCACAGC CTGTACCTGG ACCC
B7        SEQ ID NO: 151    CCACCAACCT CATGCAAGAC
B8        SEQ ID NO: 152    CTCAACCAGC GCCTCAACAC
B9        SEQ ID NO: 153    CCGCAATGCG GTGCCTCTGT CCATCACTTC TTCCGTG
B10       SEQ ID NO: 154    CGTGACGTGA TCCTCAACG
C8        SEQ ID NO: 155    GGACTGGCCA TTCCTGTAT
C9        SEQ ID NO: 156    CGCCAGCGGC TCTGGTCCC
C10       SEQ ID NO: 157    GAAGAACTAC ACCAGGGAC
D8        SEQ ID NO: 158    GCTCCGACCG CGAGGGCGTG
D9        SEQ ID NO: 159    CTCCGGAGCG GCGCCTTCAC GGCGCGTGGG AATTC
D10       SEQ ID NO: 160    CATCTCTGGT GTTCCTCTCG
E9        SEQ ID NO: 161    GCGGCAACGG CAACAGCTAC
E10       SEQ ID NO: 162    CTCCACCATC AGGGTCACCA TC
E11       SEQ ID NO: 163    GAACATCATG CTGGTGCC

pMON19694 was then restricted at the PstI and XhoI sites in the pBSKS+ polylinker, removing a small oligonucleotide region. The insert from pMON19697 was excised with PstI and XhoI and ligated into the PstI and XhoI digested pMON19694 to form pMON19703. pMON19703 was digested with BclI and PstI, removing a small oligonucleotide region, and the BglII and PstI digested insert from pMON19700 was ligated into pMON19703 to form pMON19705. pMON19705 was digested with XhoI and KpnI and the XhoI to KpnI excised insert of pMON19702 was ligated into pMON19705 to form pMON19706. pMON19706 was digested with BclI and KpnI and the BamHI to KpnI excised insert of pMON19701 was ligated into pMON19706 to form pMON19709. This comprises the final CryIIB sequence and contains the DNA sequence identified as SEQ ID NO: 2. This sequence contains 0.15% rare monocotyledonous codons, 9.7% semi-rare monocotyledonous codons, and has a CG dinucleotide composition of 6.7%. The resulting modified CryIIB DNA sequence also has 0.05% of the rarest 284 six-mers, 0.37% of the rarest 484 six-mers, and 0.94% of the rarest 664 six-mers. For comparison, the bacterial CryIA(b) DNA sequence has 9.13% of the rarest 284 six-mers, 15.5% of the rarest 484 six-mers, and 20.13% of the rarest 664 six-mers, and the monocotyledonous modified B.t. CryIA(b) DNA sequence described in Example 1 contains 0.35% of the rarest 284 six-mers, 1.12% of the rarest 484 six-mers, and 2.1% of the rarest 664 six-mers.
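Percentages of this kind can be computed by scanning a sequence for six-mers and CG dinucleotides. The sketch below is a minimal illustration under stated assumptions: windows advance one nucleotide at a time, the denominator is the total number of windows, and the rare six-mer set is a hypothetical placeholder because the monocot six-mer frequency tables are not reproduced in this section. The function names are illustrative and the results are not expected to reproduce the figures quoted above.

```python
def sixmer_percent(seq, rare_sixmers, step=1):
    """Percent of six-nucleotide windows of `seq` found in `rare_sixmers`.

    Assumptions: overlapping windows advancing by `step`, and a denominator
    equal to the number of windows examined.
    """
    if len(seq) < 6:
        raise ValueError("sequence must contain at least six nucleotides")
    windows = [seq[i:i + 6] for i in range(0, len(seq) - 5, step)]
    hits = sum(1 for window in windows if window in rare_sixmers)
    return 100.0 * hits / len(windows)


def cg_dinucleotide_percent(seq):
    """Percent of adjacent nucleotide pairs in `seq` that read CG."""
    if len(seq) < 2:
        raise ValueError("sequence must contain at least two nucleotides")
    pairs = len(seq) - 1
    cg = sum(1 for i in range(pairs) if seq[i:i + 2] == "CG")
    return 100.0 * cg / pairs


if __name__ == "__main__":
    toy_sequence = "ATGCGATTAACG"          # toy input, not a patent sequence
    hypothetical_rare = {"TTAACG"}         # placeholder, not the real table
    print(sixmer_percent(toy_sequence, hypothetical_rare))
    print(cg_dinucleotide_percent(toy_sequence))
```

Running the same scan against each of the three rare six-mer sets (the rarest 284, 484, and 664) would give three percentages per sequence, which is the form in which the values above are reported.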

pMON19709 was digested with BglII and BclI and the excised insert was ligated into pMON19470, a plasmid map of which is provided in FIG. 12, to form pMON15785. The starting DNA sequence comprising a synthetic CryIIA DNA sequence prepared by the method of Fischoff et al. was inserted into pMON19470 for use as a control for expression studies in corn.
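Several of the ligations above join ends produced by different enzymes, for example a BclI-cut vector with a BglII- or BamHI-cut insert. This works because those enzymes leave the same four-base overhang. The sketch below derives the overhangs from the standard recognition sites and cut positions of the six enzymes named in this Example. The function names are illustrative, and the two terminal snippets are copied from the ends of SEQ ID NO: 2.

```python
# Recognition site and top-strand cut position (bases from the 5' end of the
# site) for the palindromic enzymes named in this Example.
ENZYMES = {
    "BglII": ("AGATCT", 1),
    "BclI": ("TGATCA", 1),
    "BamHI": ("GGATCC", 1),
    "XhoI": ("CTCGAG", 1),
    "PstI": ("CTGCAG", 5),
    "KpnI": ("GGTACC", 5),
}

# First and last bases of SEQ ID NO: 2 as given in the sequence listing.
SEQ2_START = "AGATCTCCACCATGGACAAC"
SEQ2_END = "AGCTCTGATCA"


def sticky_end(enzyme):
    """Return (overhang sequence, overhang polarity) for a palindromic cutter."""
    site, cut = ENZYMES[enzyme]
    mirror = len(site) - cut              # cut position on the other strand
    start, stop = min(cut, mirror), max(cut, mirror)
    polarity = "5'" if cut < mirror else "3'"
    return site[start:stop], polarity


def compatible(enzyme_a, enzyme_b):
    """True when the two enzymes leave identical, ligatable overhangs."""
    return sticky_end(enzyme_a) == sticky_end(enzyme_b)


if __name__ == "__main__":
    for name in ENZYMES:
        print(name, sticky_end(name))
    print("BglII/BclI compatible:", compatible("BglII", "BclI"))
    print("BamHI/BclI compatible:", compatible("BamHI", "BclI"))
```

SEQ ID NO: 2 begins with a BglII site and ends with a BclI site, which fits the BglII and BclI digestion used to excise the insert from pMON19709. Joining a BclI end to a BglII end yields a hybrid junction (TGATCT) that neither enzyme recuts, and the same hexamer appears in SEQ ID NO: 2 at the boundary between fragments A and B.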

Corn leaf protoplasts were electroporated with CryIIB plasmid DNA or CryIIA plasmid DNA using the protocol described above. The CryIIB DNA sequence, pMON15785, was compared to the CryIIA DNA sequence, pMON19486, in the same corn gene expression cassette. The protoplast electroporation samples were done in duplicate for Western blot analysis and in triplicate for insect bioactivity assays. The protoplast extracts were assayed by diet incorporation into insect feeding assays for tobacco hornworm (THW) and European corn borer (ECB). The protein produced by the CryIIB DNA sequence in pMON15785 showed excellent insecticidal activity that was superior to the insecticidal activity of the CryIIA DNA sequence in the same vector in pMON19486. This data is presented in Table 20 below.

TABLE 20

% surviving insects

Gene construct       THW   ECB
pMON15785              0     5
pMON19486             33    88
Control (no B.t.)     88    88
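Because 12% of the control insects also died, the survival figures in Table 20 can be expressed relative to the control. The sketch below applies Abbott's correction, a standard bioassay adjustment that is not part of the original analysis and is used here only as an illustration. The dictionary and function names are hypothetical.

```python
# Percent surviving insects copied from Table 20.
SURVIVAL = {
    "pMON15785": {"THW": 0.0, "ECB": 5.0},
    "pMON19486": {"THW": 33.0, "ECB": 88.0},
    "Control (no B.t.)": {"THW": 88.0, "ECB": 88.0},
}


def corrected_mortality(treated_survival, control_survival):
    """Abbott's correction: percent mortality adjusted for control deaths."""
    if control_survival <= 0:
        raise ValueError("control survival must be positive")
    return 100.0 * (1.0 - treated_survival / control_survival)


if __name__ == "__main__":
    control = SURVIVAL["Control (no B.t.)"]
    for construct in ("pMON15785", "pMON19486"):
        for insect in ("THW", "ECB"):
            value = corrected_mortality(SURVIVAL[construct][insect],
                                        control[insect])
            print(f"{construct} {insect}: {value:.1f}% corrected mortality")
```

On this adjustment pMON15785 gives about 100% (THW) and 94% (ECB) corrected mortality, against about 62% (THW) and 0% (ECB) for pMON19486.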

Western blots also demonstrated that more protein was detected from pMON15785 than from pMON19486. The antibody used in the Western blot was raised against CryIIA, so the detection of more CryIIB is significant. Initial transgenic corn plant studies with the CryIIB DNA sequence modified by the method of the present invention have demonstrated insecticidal activity against the European corn borer when the insect was feeding on leaf discs from the transgenic plant. One of fourteen independent transgenic plants containing the modified CryIIB killed the insect. This confirms the initial transient data that the CryIIB DNA sequence is expressed in the plant and is insecticidal to European corn borer and other Lepidopteran pests.

164 3478 base pairs nucleic acid single linear unknown 1 CCATGGACAACAACCCAAAC ATCAACGAGT GCATCCCGTA CAACTGCCTC AGCAACCCTG 60 AGGTCGAGGTGCTCGGCGGT GAGCGCATCG AGACCGGTTA CACCCCCATC GACATCTCCC 120 TCTCCCTCACGCAGTTCCTG CTCAGCGAGT TCGTGCCAGG CGCTGGCTTC GTCCTGGGCC 180 TCGTGGACATCATCTGGGGC ATCTTTGGCC CCTCCCAGTG GGACGCCTTC CTGGTGCAAA 240 TCGAGCAGCTCATCAACCAG AGGATCGAGG AGTTCGCCAG GAACCAGGCC ATCAGCCGCC 300 TGGAGGGCCTCAGCAACCTC TACCAAATCT ACGCTGAGAG CTTCCGCGAG TGGGAGGCCG 360 ACCCCACTAACCCAGCTCTC CGCGAGGAGA TGCGCATCCA GTTCAACGAC ATGAACAGCG 420 CCCTGACCACCGCCATCCCA CTCTTCGCCG TCCAGAACTA CCAAGTCCCG CTCCTGTCCG 480 TGTACGTCCAGGCCGCCAAC CTGCACCTCA GCGTGCTGAG GGACGTCAGC GTGTTTGGCC 540 AGAGGTGGGGCTTCGACGCC GCCACCATCA ACAGCCGCTA CAACGACCTC ACCAGGCTGA 600 TCGGCAACTACACCGACCAC GCTGTCCGCT GGTACAACAC TGGCCTGGAG CGCGTCTGGG 660 GCCCTGATTCTAGAGACTGG ATTCGCTACA ACCAGTTCAG GCGCGAGCTG ACCCTCACCG 720 TCCTGGACATTGTGTCCCTC TTCCCGAACT ACGACTCCCG CACCTACCCG ATCCGCACCG 780 TGTCCCAACTGACCCGCGAA ATCTACACCA ACCCCGTCCT GGAGAACTTC GACGGTAGCT 840 TCAGGGGCAGCGCCCAGGGC ATCGAGGGCT CCATCAGGAG CCCACACCTG ATGGACATCC 900 TCAACAGCATCACTATCTAC ACCGATGCCC ACCGCGGCGA GTACTACTGG TCCGGCCACC 960 AGATCATGGCCTCCCCGGTC GGCTTCAGCG GCCCCGAGTT TACCTTTCCT CTCTACGGCA 1020 CGATGGGCAACGCCGCTCCA CAACAACGCA TCGTCGCTCA GCTGGGCCAG GGCGTCTACC 1080 GCACCCTGAGCTCCACCCTG TACCGCAGGC CCTTCAACAT CGGTATCAAC AACCAGCAGC 1140 TGTCCGTCCTGGATGGCACT GAGTTCGCCT ACGGCACCTC CTCCAACCTG CCCTCCGCTG 1200 TCTACCGCAAGAGCGGCACG GTGGATTCCC TGGACGAGAT CCCACCACAG AACAACAATG 1260 TGCCCCCCAGGCAGGGTTTT TCCCACAGGC TCAGCCACGT GTCCATGTTC CGCTCCGGCT 1320 TCAGCAACTCGTCCGTGAGC ATCATCAGAG CTCCTATGTT CTCCTGGATT CATCGCAGCG 1380 CGGAGTTCAACAATATCATT CCGTCCTCCC AAATCACCCA AATCCCCCTC ACCAAGTCCA 1440 CCAACCTGGGCAGCGGCACC TCCGTGGTGA AGGGCCCAGG CTTCACGGGC GGCGACATCC 1500 TGCGCAGGACCTCCCCGGGC CAGATCAGCA CCCTCCGCGT CAACATCACC GCTCCCCTGT 1560 CCCAGAGGTACCGCGTCAGG ATTCGCTACG CTAGCACCAC CAACCTGCAA TTCCACACCT 1620 CCATCGACGGCAGGCCGATC AATCAGGGTA ACTTCTCCGC CACCATGTCC AGCGGCAGCA 1680 
ACCTCCAATCCGGCAGCTTC CGCACCGTGG GTTTCACCAC CCCCTTCAAC TTCTCCAACG 1740 GCTCCAGCGTTTTCACCCTG AGCGCCCACG TGTTCAATTC CGGCAATGAG GTGTACATTG 1800 ACCGCATTGAGTTCGTGCCA GCCGAGGTCA CCTTCGAAGC CGAGTACGAC CTGGAGAGAG 1860 CCCAGAAGGCTGTCAATGAG CTCTTCACGT CCAGCAATCA GATCGGCCTG AAGACCGACG 1920 TCACTGACTACCACATCGAC CAAGTCTCCA ACCTCGTGGA GTGCCTCTCC GATGAGTTCT 1980 GCCTCGACGAGAAGAAGGAG CTGTCCGAGA AGGTGAAGCA TGCCAAGCGT CTCAGCGACG 2040 AGAGGAATCTCCTCCAGGAC CCCAATTTCC GCGGCATCAA CAGGCAGCTC GACCGCGGCT 2100 GGCGCGGCAGCACCGACATC ACGATCCAGG GCGGCGACGA TGTGTTCAAG GAGAACTACG 2160 TGACTCTCCTGGGCACTTTC GACGAGTGCT ACCCTACCTA CTTGTACCAG AAGATCGATG 2220 AGTCCAAGCTCAAGGCTTAC ACTCGCTACC AGCTCCGCGG CTACATCGAA GACAGCCAAG 2280 ACCTCGAGATTTACCTGATC CGCTACAACG CCAAGCACGA GACCGTCAAC GTGCCCGGTA 2340 CTGGTTCCCTCTGGCCGCTG AGCGCCCCCA GCCCGATCGG CAAGTGTGCC CACCACAGCC 2400 ACCACTTCTCCTTGGACATC GATGTGGGCT GCACCGACCT GAACGAGGAC CTCGGAGTCT 2460 GGGTCATCTTCAAGATCAAG ACCCAGGACG GCCACGAGCG CCTGGGCAAC CTGGAGTTCC 2520 TCGAGGGCAGGGCCCCCCTG GTCGGTGAGG CTCTGGCCAG GGTCAAGAGG GCTGAGAAGA 2580 AGTGGAGGGACAAGCGCGAG AAGCTCGAGT GGGAGACCAA CATCGTTTAC AAGGAGGCCA 2640 AGGAGAGCGTCGACGCCCTG TTCGTGAACT CCCAGTACGA CCGCCTGCAG GCCGACACCA 2700 ACATCGCCATGATCCACGCT GCCGACAAGA GGGTGCACAG CATTCGCGAG GCCTACCTGC 2760 CTGAGCTGTCCGTGATCCCT GGTGTGAACG CTGCCATCTT TGAGGAGCTG GAGGGCCGCA 2820 TCTTTACCGCATTCTCCCTG TACGACGCCC GCAACGTGAT CAAGAACGGT GACTTCAACA 2880 ATGGCCTCAGCTGCTGGAAC GTCAAGGGCC ACGTGGACGT CGAGGAACAG AACAACCACC 2940 GCTCCGTCCTGGTCGTCCCA GAGTGGGAGG CTGAGGTCTC CCAAGAGGTC CGCGTCTGCC 3000 CAGGCCGCGGCTACATTCTC AGGGTCACCG CTTACAAGGA GGGCTACGGT GAGGGCTGTG 3060 TGACCATCCACGAGATCGAG AACAACACCG ACGAGCTTAA GTTCTCCAAC TGCGTGGAGG 3120 AGGAGGTGTACCCAAACAAC ACCGTTACTT GCAACGACTA CACCGCCACC CAGGAGGAGT 3180 ACGAGGGCACCTACACTTCC AGGAACAGGG GCTACGATGG TGCCTACGAG AGCAACAGCA 3240 GCGTTCCTGCTGACTACGCT TCCGCCTACG AGGAGAAGGC CTACACGGAT GGCCGCAGGG 3300 ACAACCCTTGCGAGAGCAAC CGCGGCTACG GCGACTACAC TCCCCTGCCC GCCGGCTACG 3360 TTACCAAGGAGCTGGAGTAC TTCCCGGAGA 
CTGACAAGGT GTGGATCGAG ATCGGCGAGA 3420 CCGAGGGCACCTTCATCGTG GACAGCGTGG AGCTGCTCCT GATGGAGGAG TAGAATTC 3478 1931 basepairs nucleic acid single linear unknown 2 AGATCTCCAC CATGGACAACTCCGTCCTGA ACTCTGGTCG CACCACCATC TGCGACGCCT 60 ACAACGTCGC GGCGCATGATCCATTCAGCT TCCAGCACAA GAGCCTCGAC ACTGTTCAGA 120 AGGAGTGGAC GGAGTGGAAGAAGAACAACC ACAGCCTGTA CCTGGACCCC ATCGTCGGCA 180 CGGTGGCCAG CTTCCTTCTCAAGAAGGTCG GCTCTCTCGT CGGGAAGCGC ATCCTCTCGG 240 AACTCCGCAA CCTGATCTTTCCATCTGGCT CCACCAACCT CATGCAAGAC ATCCTCAGGG 300 AGACCGAGAA GTTTCTCAACCAGCGCCTCA ACACTGATAC CCTTGCTCGC GTCAACGCTG 360 AGCTGACGGG TCTGCAAGCAAACGTGGAGG AGTTCAACCG CCAAGTGGAC AACTTCCTCA 420 ACCCCAACCG CAATGCGGTGCCTCTGTCCA TCACTTCTTC CGTGAACACC ATGCAACAAC 480 TGTTCCTCAA CCGCTTGCCTCAGTTCCAGA TGCAAGGCTA CCAGCTGCTC CTGCTGCCAC 540 TCTTTGCTCA GGCTGCCAACCTGCACCTCT CCTTCATTCG TGACGTGATC CTCAACGCTG 600 ACGAGTGGGG CATCTCTGCAGCCACGCTGA GGACCTACCG CGACTACCTG AAGAACTACA 660 CCAGGGACTA CTCCAACTATTGCATCAACA CCTACCAGTC GGCCTTCAAG GGCCTCAATA 720 CGAGGCTTCA CGACATGCTGGAGTTCAGGA CCTACATGTT CCTGAACGTG TTCGAGTACG 780 TCAGCATCTG GTCGCTCTTCAAGTACCAGA GCCTGCTGGT GTCCAGCGGC GCCAACCTCT 840 ACGCCAGCGG CTCTGGTCCCCAACAAACTC AGAGCTTCAC CAGCCAGGAC TGGCCATTCC 900 TGTATTCGTT GTTCCAAGTCAACTCCAACT ACGTCCTCAA CGGCTTCTCT GGTGCTCGCC 960 TCTCCAACAC CTTCCCCAACATTGTTGGCC TCCCCGGCTC CACCACAACT CATGCTCTGC 1020 TTGCTGCCAG AGTGAACTACTCCGGCGGCA TCTCGAGCGG CGACATTGGT GCATCGCCGT 1080 TCAACCAGAA CTTCAACTGCTCCACCTTCC TGCCGCCGCT GCTCACCCCG TTCGTGAGGT 1140 CCTGGCTCGA CAGCGGCTCCGACCGCGAGG GCGTGGCCAC CGTCACCAAC TGGCAAACCG 1200 AGTCCTTCGA GACCACCCTTGGCCTCCGGA GCGGCGCCTT CACGGCGCGT GGGAATTCTA 1260 ACTACTTCCC CGACTACTTCATCAGGAACA TCTCTGGTGT TCCTCTCGTC GTCCGCAACG 1320 AGGACCTCCG CCGTCCACTGCACTACAACG AGATCAGGAA CATCGCCTCT CCGTCCGGGA 1380 CGCCCGGAGG TGCAAGGGCGTACATGGTGA GCGTCCATAA CAGGAAGAAC AACATCCACG 1440 CTGTGCATGA GAACGGCTCCATGATCCACC TGGCGCCCAA TGATTACACC GGCTTCACCA 1500 TCTCTCCAAT CCACGCCACCCAAGTGAACA ACCAGACACG CACCTTCATC TCCGAGAAGT 1560 TCGGCAACCA GGGCGACTCCCTGAGGTTCG 
AGCAGAACAA CACCACCGCC AGGTACACCC 1620 TGCGCGGCAA CGGCAACAGCTACAACCTGT ACCTGCGCGT CAGCTCCATT GGCAACTCCA 1680 CCATCAGGGT CACCATCAACGGGAGGGTGT ACACAGCCAC CAATGTGAAC ACGACGACCA 1740 ACAATGATGG CGTCAACGACAACGGCGCCC GCTTCAGCGA CATCAACATT GGCAACGTGG 1800 TGGCCAGCAG CAACTCCGACGTCCCGCTGG ACATCAACGT GACCCTGAAC TCTGGCACCC 1860 AGTTCGACCT CATGAACATCATGCTGGTGC CAACTAACAT CTCGCCGCTG TACTGATAGG 1920 AGCTCTGATC A 1931 3531base pairs nucleic acid single linear unknown 3 ATGGACAACA ACCCAAACATCAACGAATGC ATTCCATACA ACTGCTTGAG TAACCCAGAA 60 GTTGAAGTAC TTGGTGGAGAACGCATTGAA ACCGGTTACA CTCCCATCGA CATCTCCTTG 120 TCCTTGACAC AGTTTCTGCTCAGCGAGTTC GTGCCAGGTG CTGGGTTCGT TCTCGGACTA 180 GTTGACATCA TCTGGGGTATCTTTGGTCCA TCTCAATGGG ATGCATTCCT GGTGCAAATT 240 GAGCAGTTGA TCAACCAGAGGATCGAAGAG TTCGCCAGGA ACCAGGCCAT CTCTAGGTTG 300 GAAGGATTGA GCAATCTCTACCAAATCTAT GCAGAGAGCT TCAGAGAGTG GGAAGCCGAT 360 CCTACTAACC CAGCTCTCCGCGAGGAAATG CGTATTCAAT TCAACGACAT GAACAGCGCC 420 TTGACCACAG CTATCCCATTGTTCGCAGTC CAGAACTACC AAGTTCCTCT CTTGTCCGTG 480 TACGTTCAAG CAGCTAATCTTCACCTCAGC GTGCTTCGAG ACGTTAGCGT GTTTGGGCAA 540 AGGTGGGGAT TCGATGCTGCAACCATCAAT AGCCGTTACA ACGACCTTAC TAGGCTGATT 600 GGAAACTACA CCGACCACGCTGTTCGTTGG TACAACACTG GCTTGGAGCG TGTCTGGGGT 660 CCTGATTCTA GAGATTGGATTAGATACAAC CAGTTCAGGA GAGAATTGAC CCTCACAGTT 720 TTGGACATTG TGTCTCTCTTCCCGAACTAT GACTCCAGAA CCTACCCTAT CCGTACAGTG 780 TCCCAACTTA CCAGAGAAATCTATACTAAC CCAGTTCTTG AGAACTTCGA CGGTAGCTTC 840 CGTGGTTCTG CCCAAGGTATCGAAGGCTCC ATCAGGAGCC CACACTTGAT GGACATCTTG 900 AACAGCATAA CTATCTACACCGATGCTCAC AGAGGAGAGT ATTACTGGTC TGGACACCAG 960 ATCATGGCCT CTCCAGTTGGATTCAGCGGG CCCGAGTTTA CCTTTCCTCT CTATGGAACT 1020 ATGGGAAACG CCGCTCCACAACAACGTATC GTTGCTCAAC TAGGTCAGGG TGTCTACAGA 1080 ACCTTGTCTT CCACCTTGTACAGAAGACCC TTCAATATCG GTATCAACAA CCAGCAACTT 1140 TCCGTTCTTG ACGGAACAGAGTTCGCCTAT GGAACCTCTT CTAACTTGCC ATCCGCTGTT 1200 TACAGAAAGA GCGGAACCGTTGATTCCTTG GACGAAATCC CACCACAGAA CAACAATGTG 1260 CCACCCAGGC AAGGATTCTCCCACAGGTTG AGCCACGTGT CCATGTTCCG TTCCGGATTC 1320 AGCAACAGTT 
CCGTGAGCATCATCAGAGCT CCTATGTTCT CATGGATTCA TCGTAGTGCT 1380 GAGTTCAACA ATATCATTCCTTCCTCTCAA ATCACCCAAA TCCCATTGAC CAAGTCTACT 1440 AACCTTGGAT CTGGAACTTCTGTCGTGAAA GGACCAGGCT TCACAGGAGG TGATATTCTT 1500 AGAAGAACTT CTCCTGGCCAGATTAGCACC CTCAGAGTTA ACATCACTGC ACCACTTTCT 1560 CAAAGATATC GTGTCAGGATTCGTTACGCA TCTACCACTA ACTTGCAATT CCACACCTCC 1620 ATCGACGGAA GGCCTATCAATCAGGGTAAC TTCTCCGCAA CCATGTCAAG CGGCAGCAAC 1680 TTGCAATCCG GCAGCTTCAGAACCGTCGGT TTCACTACTC CTTTCAACTT CTCTAACGGA 1740 TCAAGCGTTT TCACCCTTAGCGCTCATGTG TTCAATTCTG GCAATGAAGT GTACATTGAC 1800 CGTATTGAGT TTGTGCCTGCCGAAGTTACC CTCGAGGCTG AGTACAACCT TGAGAGAGCC 1860 CAGAAGGCTG TGAACGCCCTCTTTACCTCC ACCAATCAGC TTGGCTTGAA AACTAACGTT 1920 ACTGACTATC ACATTGACCAAGTGTCCAAC TTGGTCACCT ACCTTAGCGA TGAGTTCTGC 1980 CTCGACGAGA AGCGTGAACTCTCCGAGAAA GTTAAACACG CCAAGCGTCT CAGCGACGAG 2040 AGGAATCTCT TGCAAGACTCCAACTTCAAA GACATCAACA GGCAGCCAGA ACGTGGTTGG 2100 GGTGGAAGCA CCGGGATCACCATCCAAGGA GGCGACGATG TGTTCAAGGA GAACTACGTC 2160 ACCCTCTCCG GAACTTTCGACGAGTGCTAC CCTACCTACT TGTACCAGAA GATCGATGAG 2220 TCCAAACTCA AAGCCTTCACCAGGTATCAA CTTAGAGGCT ACATCGAAGA CAGCCAAGAC 2280 CTTGAAATCT ACTCGATCAGGTACAATGCC AAGCACGAGA CCGTGAATGT CCCAGGTACT 2340 GGTTCCCTCT GGCCACTTTCTGCCCAATCT CCCATTGGGA AGTGTGGAGA GCCTAACAGA 2400 TGCGCTCCAC ACCTTGAGTGGAATCCTGAC TTGGACTGCT CCTGCAGGGA TGGCGAGAAG 2460 TGTGCCCACC ATTCTCATCACTTCTCCTTG GACATCGATG TGGGATGTAC TGACCTGAAT 2520 GAGGACCTCG GAGTCTGGGTCATCTTCAAG ATCAAGACCC AAGACGGACA CGCAAGACTT 2580 GGCAACCTTG AGTTTCTCGAAGAGAAACCA TTGGTCGGTG AAGCTCTCGC TCGTGTGAAG 2640 AGAGCAGAGA AGAAGTGGAGGGACAAACGT GAGAAACTCG AATGGGAAAC TAACATCGTT 2700 TACAAGGAGG CCAAAGAGTCCGTGGATGCT TTGTTCGTGA ACTCCCAATA TGATCAGTTG 2760 CAAGCCGACA CCAACATCGCCATGATCCAC GCCGCAGACA AACGTGTGCA CAGCATTCGT 2820 GAGGCTTACT TGCCTGAGTTGTCCGTGATC CCTGGTGTGA ACGCTGCCAT CTTCGAGGAA 2880 CTTGAGGGAC GTATCTTTACCGCATTCTCC TTGTACGATG CCAGAAACGT CATCAAGAAC 2940 GGTGACTTCA ACAATGGCCTCAGCTGCTGG AATGTGAAAG GTCATGTGGA CGTGGAGGAA 3000 CAGAACAATC AGCGTTCCGTCCTGGTTGTG CCTGAGTGGG 
AAGCTGAAGT GTCCCAAGAG 3060 GTTAGAGTCT GTCCAGGTAGAGGCTACATT CTCCGTGTGA CCGCTTACAA GGAGGGATAC 3120 GGTGAGGGTT GCGTGACCATCCACGAGATC GAGAACAACA CCGACGAGCT TAAGTTCTCC 3180 AACTGCGTCG AGGAAGAAATCTATCCCAAC AACACCGTTA CTTGCAACGA CTACACTGTG 3240 AATCAGGAAG AGTACGGAGGTGCCTACACT AGCCGTAACA GAGGTTACAA CGAAGCTCCT 3300 TCCGTTCCTG CTGACTATGCCTCCGTGTAC GAGGAGAAAT CCTACACAGA TGGCAGACGT 3360 GAGAACCCTT GCGAGTTCAACAGAGGTTAC AGGGACTACA CACCACTTCC AGTTGGCTAT 3420 GTTACCAAGG AGCTTGAGTACTTTCCTGAG ACCGACAAAG TGTGGATCGA GATCGGTGAA 3480 ACCGAGGGAA CCTTCATCGTGGACAGCGTG GAGCTTCTCT TGATGGAGGA A 3531 18 base pairs nucleic acidsingle linear unknown 4 TCGAGTGATT CGAATGAG 18 18 base pairs nucleicacid single linear unknown 5 AATTCTCATT CGAATCAC 18 63 base pairsnucleic acid single linear unknown 6 TCTAGAGACT GGATTCGCTA CAACCAGTTCAGGCGCGAGC TGACCCTCAC CGTCCTGGAC 60 ATT 63 41 base pairs nucleic acidsingle linear unknown 7 ATTGTGTCCC TCTTCCCGAA CTACGACTCC CGCACCTACC C 4143 base pairs nucleic acid single linear unknown 8 ACCTACCCGA TCCGCACCGTGTCCCAACTG ACCCGCGAAA TCT 43 32 base pairs nucleic acid single linearunknown 9 AAATCTACAC CAACCCCGTC CTGGAGAACT TC 32 39 base pairs nucleicacid single linear unknown 10 AGCTTCAGGG GCAGCGCCCA GGGCATCGAG GGCTCCATC39 41 base pairs nucleic acid single linear unknown 11 GCCCACACCTGATGGACATC CTCAACAGCA TCACTATCTA C 41 48 base pairs nucleic acid singlelinear unknown 12 TACACCGATG CCCACCGCGG CGAGTACTAC TGGTCCGGCC ACCAGATC48 35 base pairs nucleic acid single linear unknown 13 ATGGCCTCCCCGGTCGGCTT CAGCGGCCCC GAGTT 35 29 base pairs nucleic acid single linearunknown 14 CCTCTCTACG GCACGATGGG CAACGCCGC 29 41 base pairs nucleic acidsingle linear unknown 15 CAACAACGCA TCGTCGCTCA GCTGGGCCAG GGTGTCTACA G41 56 base pairs nucleic acid single linear unknown 16 GCGTCTACCGCACCCTGAGC TCCACCCTGT ACCGCAGGCC CTTCAACATC GGTATC 56 38 base pairsnucleic acid single linear unknown 17 AACCAGCAGC TGTCCGTCCT GGATGGCACTGAGTTCGC 38 53 base pairs nucleic acid single linear unknown 18TTCGCCTACG 
GCACCTCCTC CAACCTGCCC TCCGCTGTCT ACCGCAAGAG CGG 53 38 basepairs nucleic acid single linear unknown 19 AAGAGCGGCA CGGTGGATTCCCTGGACGAG ATCCCACC 38 44 base pairs nucleic acid single linear unknown20 AATGTGCCCC CCAGGCAGGG TTTTTCCCAC AGGCTCAGCC ACGT 44 36 base pairsnucleic acid single linear unknown 21 ATGTTCCGCT CCGGCTTCAG CAACTCGTCCGTGAGC 36 33 base pairs nucleic acid single linear unknown 22 GGGCAGCGCCCAGGGCATCG AGGGCTCCAT CAG 33 19 base pairs nucleic acid single linearunknown 23 TGCCCACCGC GGCGAGTAC 19 29 base pairs nucleic acid singlelinear unknown 24 CCGGTCGGCT TCAGCGGCCC CGAGTTTAC 29 63 base pairsnucleic acid single linear unknown 25 GGCCAGGGCG TCTACCGCAC CCTGAGCTCCACCCTGTACC GCAGGCCCTT CAACATCGGT 60 ATC 63 29 base pairs nucleic acidsingle linear unknown 26 CTGTCCGTCC TGGATGGCAC TGAGTTCGC 29 20 basepairs nucleic acid single linear unknown 27 TCAGCAACTC GTCCGTGAGC 20 36base pairs nucleic acid single linear unknown 28 ATGTTCTCCT GGATTCATCGCAGCGCGGAG TTCAAC 36 43 base pairs nucleic acid single linear unknown 29TCATTCCGTC CTCCCAAATC ACCCAAATCC CCCTCACCAA GTC 43 53 base pairs nucleicacid single linear unknown 30 ACCAAGTCCA CCAACCTGGG CAGCGGCACCTCCGTGGTGA AGGGCCCAGG CTT 53 56 base pairs nucleic acid single linearunknown 31 GGCTTCACGG GCGGCGACAT CCTGCGCAGG ACCTCCCCGG GCCAGATCAG CACCCT56 59 base pairs nucleic acid single linear unknown 32 GCACCCTCCGCGTCAACATC ACCGCTCCCC TGTCCCAGAG GTACGTACCG CGTCAGGAT 59 36 base pairsnucleic acid single linear unknown 33 AGGATTCGCT ACGCTAGCAC CACCAACCTGCAATTC 36 24 base pairs nucleic acid single linear unknown 34 ATCGACGGCAGGCCGATCAA TCAG 24 41 base pairs nucleic acid single linear unknown 35TTCTCCGCCA CCATGTCCAG CGGCAGCAAC CTCCAATCCG G 41 41 base pairs nucleicacid single linear unknown 36 GCAGCTTCCG CACCGTGGGT TTCACCACCCCCTTCAACTT C 41 41 base pairs nucleic acid single linear unknown 37AACTTCTCCA ACGGCTCCAG CGTTTTCACC CTGAGCGCTC A 41 56 base pairs nucleicacid single linear unknown 38 CTGAGCGCCC ACGTGTTCAA TTCCGGCAATGAGGTGTACA 
TTGACCGCAT TGAGTT 56 41 base pairs nucleic acid single linearunknown 39 ATTGAGTTCG TGCCAGCCGA GGTCACCTTC GAAGGGGGGC C 41 46 basepairs nucleic acid single linear unknown 40 TGAAGGGCCC AGGCTTCACGGGCGGCGACA TCCTGCGCAG GACCTC 46 35 base pairs nucleic acid single linearunknown 41 CTAGCACCAC CAACCTGCAA TTCCACACCT CCATC 35 20 base pairsnucleic acid single linear unknown 42 GGGGATCCAC CATGGACAAC 20 56 basepairs nucleic acid single linear unknown 43 ATCAACGAGT GCATCCCGTACAACTGCCTC AGCAACCCTG AGGTCGAGGT ACTTGG 56 52 base pairs nucleic acidsingle linear unknown 44 GAGGTCGAGG TGCTCGGCGG TGAGCGCATC GAGACCGGTTACACCCCCAT CG 52 34 base pairs nucleic acid single linear unknown 45ACATCTCCCT CTCCCTCACG CAGTTCCTGC TCAG 34 42 base pairs nucleic acidsingle linear unknown 46 GTGCCAGGCG CTGGCTTCGT CCTGGGCCTC GTGGACATCA TC42 44 base pairs nucleic acid single linear unknown 47 ATCTGGGGCATCTTTGGCCC CTCCCAGTGG GACGCCTTCC TGGT 44 44 base pairs nucleic acidsingle linear unknown 48 GTGCAAATCG AGCAGCTCAT CAACCAGAGG ATCGAGGAGTTCGC 44 58 base pairs nucleic acid single linear unknown 49 AGGCCATCAGCCGCCTGGAG GGCCTCAGCA ACCTCTACCA AATCTACGCT GAGAGCTT 58 37 base pairsnucleic acid single linear unknown 50 AGAGCTTCCG CGAGTGGGAG GCCGACCCCACTAACCC 37 30 base pairs nucleic acid single linear unknown 51CGCGAGGAGA TGCGCATCCA GTTCAACGAC 30 44 base pairs nucleic acid singlelinear unknown 52 ACAGCGCCCT GACCACCGCC ATCCCACTCT TCGCCGTCCA GAAC 44 53base pairs nucleic acid single linear unknown 53 TACCAAGTCC CGCTCCTGTCCGTGTACGTC CAGGCCGCCA ACCTGCACCT CAG 53 62 base pairs nucleic acidsingle linear unknown 54 AGCGTGCTGA GGGACGTCAG CGTGTTTGGC CAGAGGTGGGGCTTCGACGC CGCCACCATC 60 AA 62 50 base pairs nucleic acid single linearunknown 55 ACCATCAACA GCCGCTACAA CGACCTCACC AGGCTGATCG GCAACTACAC 50 53base pairs nucleic acid single linear unknown 56 CACGCTGTCC GCTGGTACAACACTGGCCTG GAGCGCGTCT GGGGCCCTGA TTC 53 17 base pairs nucleic acidsingle linear unknown 57 GGCGCTGGCT TCGTCCT 17 20 base pairs nucleicacid single linear unknown 
58 CAAATCTACG CTGAGAGCTT 20 22 base pairsnucleic acid single linear unknown 59 TAACCCAGCT CTCCGCGAGG AG 22 18base pairs nucleic acid single linear unknown 60 CTTCGACGCC GCCACCAT 1838 base pairs nucleic acid single linear unknown 61 GGGCCCCCCTTCGAAGCCGA GTACGACCTG GAGAGAGC 38 36 base pairs nucleic acid singlelinear unknown 62 AAGGCTGTCA ATGAGCTCTT CACGTCCAGC AATCAG 36 36 basepairs nucleic acid single linear unknown 63 CAATCAGATC GGCCTGAAGACCGACGTCAC TGACTA 36 55 base pairs nucleic acid single linear unknown 64ACTGACTACC ACATCGACCA AGTCTCCAAC CTCGTGGAGT GCCTCTCCGA TGAGT 55 43 basepairs nucleic acid single linear unknown 65 ACGAGAAGAA GGAGCTGTCCGAGAAGGTGA AGCATGCCAA GCG 43 39 base pairs nucleic acid single linearunknown 66 GGAATCTCCT CCAGGACCCC AATTTCCGCG GCATCAACA 39 35 base pairsnucleic acid single linear unknown 67 CAGGCAGCTC GACCGCGGCT GGCGCGGCAGCACCG 35 32 base pairs nucleic acid single linear unknown 68 AGCACCGACATCACGATCCA GGGCGGCGAC GA 32 29 base pairs nucleic acid single linearunknown 69 AACTACGTGA CTCTCCTGGG CACTTTCGA 29 47 base pairs nucleic acidsingle linear unknown 70 GAGTCCAAGC TCAAGGCTTA CACTCGCTAC CAGCTCCGCGGCTACAT 47 41 base pairs nucleic acid single linear unknown 71CAAGACCTCG AGATTTACCT GATCCGCTAC AACGCCAAGC A 41 26 base pairs nucleicacid single linear unknown 72 GAGACCGTCA ACGTGCCCGG TACTGG 26 40 basepairs nucleic acid single linear unknown 73 CTCTGGCCGC TGAGCGCCCCCAGCCCGATC GGCAAGTGTG 40 22 base pairs nucleic acid single linearunknown 74 CCCACCACAG CCACCACTTC TC 22 32 base pairs nucleic acid singlelinear unknown 75 GATGTGGGCT GCACCGACCT GAACGAGGAC CT 32 35 base pairsnucleic acid single linear unknown 76 AAGACCCAGG ACGGCCACGA GCGCCTGGGCAACCT 35 42 base pairs nucleic acid single linear unknown 77 GGCAACCTGGAGTTCCTCGA GGGCAGGGCC CCCCTGGTCG GT 42 41 base pairs nucleic acid singlelinear unknown 78 GTCGGTGAGG CTCTGGCCAG GGTCAAGAGG GCTGAGAAGA A 41 41base pairs nucleic acid single linear unknown 79 AGGGACAAGC GCGAGAAGCTCGAGTGGGAG ACCAACATCG T 41 33 base 
pairs nucleic acid single linearunknown 80 GAGGCCAAGG AGAGCGTCGA CGCCCTGTTC GTG 33 32 base pairs nucleicacid single linear unknown 81 AACTCCCAGT ACGACCGCCT GCAGGCCGAC AC 32 28base pairs nucleic acid single linear unknown 82 ATCCACGCTG CCGACAAGAGGGTGCACA 28 35 base pairs nucleic acid single linear unknown 83GCATTCGCGA GGCCTACCTG CCTGAGCTGT CCGTG 35 35 base pairs nucleic acidsingle linear unknown 84 GCCATCTTTG AGGAGCTGGA GGGCCGCATC TTTAC 35 37base pairs nucleic acid single linear unknown 85 CATTCTCCCT GTACGACGCCCGCAACGTGA TCAAGAA 37 20 base pairs nucleic acid single linear unknown86 GGCCTCAGCT GGAATTCCTG 20 31 base pairs nucleic acid single linearunknown 87 CAAGAGGGCT GAGAAGAAGT GGAGGGACAA G 31 59 base pairs nucleicacid single linear unknown 88 TACTGGTTCC CTCTGGCCGC TGAGCGCCCCCAGCCCGATC GGCAAGTGTG CCCACCACA 59 49 base pairs nucleic acid singlelinear unknown 89 ATAAGCTTCA GCTGCTGGAA CGTCAAGGGC CACGTGGACG TCGAGGAAC49 40 base pairs nucleic acid single linear unknown 90 AGAACAACCACCGCTCCGTC CTGGTCGTCC CAGAGTGGGA 40 26 base pairs nucleic acid singlelinear unknown 91 GAGTGGGAGG CTGAGGTCTC CCAAGA 26 53 base pairs nucleicacid single linear unknown 92 CAAGAGGTCC GCGTCTGCCC AGGCCGCGGCTACATTCTCA GGGTCACCGC TTA 53 32 base pairs nucleic acid single linearunknown 93 AAGGAGGGCT ACGGTGAGGG CTGTGTGACC AT 32 35 base pairs nucleicacid single linear unknown 94 AACTGCGTGG AGGAGGAGGT GTACCCAAAC AACAC 3542 base pairs nucleic acid single linear unknown 95 GACTACACCGCCACCCAGGA GGAGTACGAG GGCACCTACA CT 42 60 base pairs nucleic acid singlelinear unknown 96 CCTACACTTC CAGGAACAGG GGCTACGATG GTGCCTACGA GAGCAACAGCAGCGTTCCTG 60 37 base pairs nucleic acid single linear unknown 97CTGACTACGC TTCCGCCTAC GAGGAGAAGG CCTACAC 37 31 base pairs nucleic acidsingle linear unknown 98 CCTACACGGA TGGCCGCAGG GACAACCCTT G 31 34 basepairs nucleic acid single linear unknown 99 CTTGCGAGAG CAACCGCGGCTACGGCGACT ACAC 34 34 base pairs nucleic acid single linear unknown 100GACTACACTC CCCTGCCCGC CGGCTACGTT ACCA 34 39 base 
pairs nucleic acidsingle linear unknown 101 AGGAGCTGGA GTACTTCCCG GAGACTGACA AGGTGTGGA 3931 base pairs nucleic acid single linear unknown 102 TCGAGATCGGCGAGACCGAG GGCACCTTCA T 31 41 base pairs nucleic acid single linearunknown 103 GTGGAGCTGC TCCTGATGGA GGAGTAGAAT TCCTCTAAGC T 41 59 basepairs nucleic acid single linear unknown 104 CTGGTCGTCC CAGAGTGGGAGGCTGAGGTC TCCCAAGAGG TCCGCGTCTG CCCAGGCCG 59 3484 base pairs nucleicacid single linear unknown 105 AGATCTCCAT GGACAACAAC CCAAACATCAACGAATGCAT TCCATACAAC TGCTTGAGTA 60 ACCCAGAAGT TGAAGTACTT GGTGGAGAACGCATTGAAAC CGGTTACACT CCCATCGACA 120 TCTCCTTGTC CTTGACACAG TTTCTGCTCAGCGAGTTCGT GCCAGGTGCT GGGTTCGTTC 180 TCGGACTAGT TGACATCATC TGGGGTATCTTTGGTCCATC TCAATGGGAT GCATTCCTGG 240 TGCAAATTGA GCAGTTGATC AACCAGAGGATCGAAGAGTT CGCCAGGAAC CAGGCCATCT 300 CTAGGTTGGA AGGATTGAGC AATCTCTACCAAATCTATGC AGAGAGCTTC AGAGAGTGGG 360 AAGCCGATCC TACTAACCCA GCTCTCCGCGAGGAAATGCG TATTCAATTC AACGACATGA 420 ACAGCGCCTT GACCACAGCT ATCCCATTGTTCGCAGTCCA GAACTACCAA GTTCCTCTCT 480 TGTCCGTGTA CGTTCAAGCA GCTAATCTTCACCTCAGCGT GCTTCGAGAC GTTAGCGTGT 540 TTGGGCAAAG GTGGGGATTC GATGCTGCAACCATCAATAG CCGTTACAAC GACCTTACTA 600 GGCTGATTGG AAACTACACC GACCACGCTGTTCGTTGGTA CAACACTGGC TTGGAGCGTG 660 TCTGGGGTCC TGATTCTAGA GATTGGATTAGATACAACCA GTTCAGGAGA GAATTGACCC 720 TCACAGTTTT GGACATTGTG TCTCTCTTCCCGAACTATGA CTCCAGAACC TACCCTATCC 780 GTACAGTGTC CCAACTTACC AGAGAAATCTATACTAACCC AGTTCTTGAG AACTTCGACG 840 GTAGCTTCCG TGGTTCTGCC CAAGGTATCGAAGGCTCCAT CAGGAGCCCA CACTTGATGG 900 ACATCTTGAA CAGCATAACT ATCTACACCGATGCTCACAG AGGAGAGTAT TACTGGTCTG 960 GACACCAGAT CATGGCCTCT CCAGTTGGATTCAGCGGGCC CGAGTTTACC TTTCCTCTCT 1020 ATGGAACTAT GGGAAACGCC GCTCCACAACAACGTATCGT TGCTCAACTA GGTCAGGGTG 1080 TCTACAGAAC CTTGTCTTCC ACCTTGTACAGAAGACCCTT CAATATCGGT ATCAACAACC 1140 AGCAACTTTC CGTTCTTGAC GGAACAGAGTTCGCCTATGG AACCTCTTCT AACTTGCCAT 1200 CCGCTGTTTA CAGAAAGAGC GGAACCGTTGATTCCTTGGA CGAAATCCCA CCACAGAACA 1260 ACAATGTGCC ACCCAGGCAA GGATTCTCCCACAGGTTGAG CCACGTGTCC ATGTTCCGTT 1320 CCGGATTCAG CAACAGTTCC 
GTGAGCATCATCAGAGCTCC TATGTTCTCA TGGATTCATC 1380 GTAGTGCTGA GTTCAACAAT ATCATTCCTTCCTCTCAAAT CACCCAAATC CCATTGACCA 1440 AGTCTACTAA CCTTGGATCT GGAACTTCTGTCGTGAAAGG ACCAGGCTTC ACAGGAGGTG 1500 ATATTCTTAG AAGAACTTCT CCTGGCCAGATTAGCACCCT CAGAGTTAAC ATCACTGCAC 1560 CACTTTCTCA AAGATATCGT GTCAGGATTCGTTACGCATC TACCACTAAC TTGCAATTCC 1620 ACACCTCCAT CGACGGAAGG CCTATCAATCAGGGTAACTT CTCCGCAACC ATGTCAAGCG 1680 GCAGCAACTT GCAATCCGGC AGCTTCAGAACCGTCGGTTT CACTACTCCT TTCAACTTCT 1740 CTAACGGATC AAGCGTTTTC ACCCTTAGCGCTCATGTGTT CAATTCTGGC AATGAAGTGT 1800 ACATTGACCG TATTGAGTTT GTGCCTGCCGAAGTTACCTT CGAAGCCGAG TACGACCTGG 1860 AGAGAGCCCA GAAGGCTGTC AATGAGCTCTTCACGTCCAG CAATCAGATC GGCCTGAAGA 1920 CCGACGTCAC TGACTACCAC ATCGACCAAGTCTCCAACCT CGTGGAGTGC CTCTCCGATG 1980 AGTTCTGCCT CGACGAGAAG AAGGAGCTGTCCGAGAAGGT GAAGCATGCC AAGCGTCTCA 2040 GCGACGAGAG GAATCTCCTC CAGGACCCCAATTTCCGCGG CATCAACAGG CAGCTCGACC 2100 GCGGCTGGCG CGGCAGCACC GACATCACGATCCAGGGCGG CGACGATGTG TTCAAGGAGA 2160 ACTACGTGAC TCTCCTGGGC ACTTTCGACGAGTGCTACCC TACCTACTTG TACCAGAAGA 2220 TCGATGAGTC CAAGCTCAAG GCTTACACTCGCTACCAGCT CCGCGGCTAC ATCGAAGACA 2280 GCCAAGACCT CGAGATTTAC CTGATCCGCTACAACGCCAA GCACGAGACC GTCAACGTGC 2340 CCGGTACTGG TTCCCTCTGG CCGCTGAGCGCCCCCAGCCC GATCGGCAAG TGTGCCCACC 2400 ACAGCCACCA CTTCTCCTTG GACATCGATGTGGGCTGCAC CGACCTGAAC GAGGACCTCG 2460 GAGTCTGGGT CATCTTCAAG ATCAAGACCCAGGACGGCCA CGAGCGCCTG GGCAACCTGG 2520 AGTTCCTCGA GGGCAGGGCC CCCCTGGTCGGTGAGGCTCT GGCCAGGGTC AAGAGGGCTG 2580 AGAAGAAGTG GAGGGACAAG CGCGAGAAGCTCGAGTGGGA GACCAACATC GTTTACAAGG 2640 AGGCCAAGGA GAGCGTCGAC GCCCTGTTCGTGAACTCCCA GTACGACCGC CTGCAGGCCG 2700 ACACCAACAT CGCCATGATC CACGCTGCCGACAAGAGGGT GCACAGCATT CGCGAGGCCT 2760 ACCTGCCTGA GCTGTCCGTG ATCCCTGGTGTGAACGCTGC CATCTTTGAG GAGCTGGAGG 2820 GCCGCATCTT TACCGCATTC TCCCTGTACGACGCCCGCAA CGTGATCAAG AACGGTGACT 2880 TCAACAATGG CCTCAGCTGC TGGAACGTCAAGGGCCACGT GGACGTCGAG GAACAGAACA 2940 ACCACCGCTC CGTCCTGGTC GTCCCAGAGTGGGAGGCTGA GGTCTCCCAA GAGGTCCGCG 3000 TCTGCCCAGG CCGCGGCTAC ATTCTCAGGGTCACCGCTTA CAAGGAGGGC 
TACGGTGAGG 3060 GCTGTGTGAC CATCCACGAG ATCGAGAACAACACCGACGA GCTTAAGTTC TCCAACTGCG 3120 TGGAGGAGGA GGTGTACCCA AACAACACCGTTACTTGCAA CGACTACACC GCCACCCAGG 3180 AGGAGTACGA GGGCACCTAC ACTTCCAGGAACAGGGGCTA CGATGGTGCC TACGAGAGCA 3240 ACAGCAGCGT TCCTGCTGAC TACGCTTCCGCCTACGAGGA GAAGGCCTAC ACGGATGGCC 3300 GCAGGGACAA CCCTTGCGAG AGCAACCGCGGCTACGGCGA CTACACTCCC CTGCCCGCCG 3360 GCTACGTTAC CAAGGAGCTG GAGTACTTCCCGGAGACTGA CAAGGTGTGG ATCGAGATCG 3420 GCGAGACCGA GGGCACCTTC ATCGTGGACAGCGTGGAGCT GCTCCTGATG GAGGAGTAGA 3480 ATTC 3484 1919 base pairs nucleicacid single linear unknown 106 ATGGACAACA ACGTCTTGAA CTCTGGTAGAACAACCATCT GCGACGCATA CAACGTCGTG 60 GCTCACGATC CATTCAGCTT CGAACACAAGAGCCTCGACA CTATTCAGAA GGAGTGGATG 120 GAATGGAAAC GTACTGACCA CTCTCTCTACGTCGCACCTG TGGTTGGAAC AGTGTCCAGC 180 TTCCTTCTCA AGAAGGTCGG CTCTCTCATCGGAAAACGTA TCTTGTCCGA ACTCTGGGGT 240 ATCATCTTTC CATCTGGGTC CACTAATCTCATGCAAGACA TCTTGAGGGA GACCGAACAG 300 TTTCTCAACC AGCGTCTCAA CACTGATACCTTGGCTAGAG TCAACGCTGA GTTGATCGGT 360 CTCCAAGCAA ACATTCGTGA GTTCAACCAGCAAGTGGACA ACTTCTTGAA TCCAACTCAG 420 AATCCTGTGC CTCTTTCCAT CACTTCTTCCGTGAACACTA TGCAGCAACT CTTCCTCAAC 480 AGATTGCCTC AGTTTCAGAT TCAAGGCTACCAGTTGCTCC TTCTTCCACT CTTTGCTCAG 540 GCTGCCAACA TGCACTTGTC CTTCATACGTGACGTGATCC TCAACGCTGA CGAATGGGGA 600 ATCTCTGCAG CCACTCTTAG GACATACAGAGACTACTTGA GGAACTACAC TCGTGATTAC 660 TCCAACTATT GCATCAACAC TTATCAGACTGCCTTTCGTG GACTCAATAC TAGGCTTCAC 720 GACATGCTTG AGTTCAGGAC CTACATGTTCCTTAACGTGT TTGAGTACGT CAGCATTTGG 780 AGTCTCTTCA AGTACCAGAG CTTGATGGTGTCCTCTGGAG CCAATCTCTA CGCCTCTGGC 840 AGTGGACCAC AGCAAACTCA GAGCTTCACAGCTCAGAACT GGCCATTCTT GTATAGCTTG 900 TTCCAAGTCA ACTCCAACTA CATTCTCAGTGGTATCTCTG GGACCAGACT CTCCATAACC 960 TTTCCCAACA TTGGTGGACT TCCAGGCTCCACTACAACCC ATAGCCTTAA CTCTGCCAGA 1020 GTGAACTACA GTGGAGGTGT CAGCTCTGGATTGATTGGTG CAACTAACTT GAACCACAAC 1080 TTCAATTGCT CCACCGTCTT GCCACCTCTGAGCACACCGT TTGTGAGGTC CTGGCTTGAC 1140 AGCGGTACTG ATCGCGAAGG AGTTGCTACCTCTACAAACT GGCAAACCGA GTCCTTCCAA 1200 ACCACTCTTA GCCTTCGGTG TGGAGCTTTCTCTGCACGTG 
GGAATTCAAA CTACTTTCCA 1260 GACTACTTCA TTAGGAACAT CTCTGGTGTTCCTCTCGTCA TCAGGAATGA AGACCTCACC 1320 CGTCCACTTC ATTACAACCA GATTAGGAACATCGAGTCTC CATCCGGTAC TCCAGGAGGT 1380 GCAAGAGCTT ACCTCGTGTC TGTCCATAACAGGAAGAACA ACATCTACGC TGCCAACGAG 1440 AATGGCACCA TGATTCACCT TGCACCAGAAGATTACACTG GATTCACCAT CTCTCCAATC 1500 CATGCTACCC AAGTGAACAA TCAGACACGCACCTTCATCT CCGAAAAGTT CGGAAATCAA 1560 GGTGACTCCT TGAGGTTCGA GCAATCCAACACTACCGCTA GGTACACTTT GAGAGGCAAT 1620 GGAAACAGCT ACAACCTTTA CTTGAGAGTTAGCTCCATTG GTAACTCCAC CATCCGTGTT 1680 ACCATCAACG GACGTGTTTA CACAGTCTCTAATGTGAACA CTACAACGAA CAATGATGGC 1740 GTTAACGACA ACGGAGCCAG ATTCAGCGACATCAACATTG GCAACATCGT GGCCTCTGAC 1800 AACACTAACG TTACTTTGGA CATCAATGTGACCCTCAATT CTGGAACTCC ATTTGATCTC 1860 ATGAACATCA TGTTTGTGCC AACTAACCTCCCTCCATTGT ACTAATGAGA TCTAAGCTT 1919 57 base pairs nucleic acid singlelinear unknown 107 TCTAGAAGAT CTCCACCATG GACAACTCCG TCCTGAACTCTGGTCGCACC ACCATCT 57 70 base pairs nucleic acid single linear unknown108 GCGACGCCTA CAACGTCGCG GCGCATGATC CATTCAGCTT CCAGCACAAG AGCCTCGACA 60CTGTTCAGAA 70 75 base pairs nucleic acid single linear unknown 109GGAGTGGACG GAGTGGAAGA AGAACAACCA CAGCCTGTAC CTGGACCCCA TCGTCGGCAC 60GGTGGCCAGC TTCCT 75 68 base pairs nucleic acid single linear unknown 110TCTCAAGAAG GTCGGCTCTC TCGTCGGGAA GCGCATCCTC TCGGAACTCC GCAACCTGAT 60CAGGATCC 68 20 base pairs nucleic acid single linear unknown 111CCATCTAGAA GATCTCCACC 20 20 base pairs nucleic acid single linearunknown 112 TGGGGATCCT GATCAGGTTG 20 81 base pairs nucleic acid singlelinear unknown 113 AGATCTTTCC ATCTGGCTCC ACCAACCTCA TGCAAGACATCCTCAGGGAG ACCGAGAAGT 60 TTCTCAACCA GCGCCTCAAC A 81 75 base pairsnucleic acid single linear unknown 114 CTGATACCCT TGCTCGCGTC AACGCTGAGCTGACGGGTCT GCAAGCAAAC GTGGAGGAGT 60 TCAACCGCCA AGTGG 75 45 base pairsnucleic acid single linear unknown 115 ACAACTTCCT CAACCCCAAC CGCAATGCGGTGCCTCTGTC CATCA 45 65 base pairs nucleic acid single linear unknown 116CTTCTTCCGT GAACACCATG CAACAACTGT TCCTCAACCG CTTGCCTCAG TTCCAGATGC 60AAGGC 65 69 
base pairs nucleic acid single linear unknown 117 TACCAGCTGCTCCTGCTGCC ACTCTTTGCT CAGGCTGCCA ACCTGCACCT CTCCTTCATT 60 CGTGACGTG 6934 base pairs nucleic acid single linear unknown 118 ATCCTCAACGCTGACGAGTG GGGCATCTCT GCAG 34 20 base pairs nucleic acid single linearunknown 119 CCAAGATCTT TCCATCTGGC 20 20 base pairs nucleic acid singlelinear unknown 120 GGTCTGCAGA GATGCCCCAC 20 67 base pairs nucleic acidsingle linear unknown 121 CTGCAGCCAC GCTGAGGACC TACCGCGACT ACCTGAAGAACTACACCAGG GACTACTCCA 60 ACTATTG 67 69 base pairs nucleic acid singlelinear unknown 122 CATCAACACC TACCAGTCGG CCTTCAAGGG CCTCAATACGAGGCTTCACG ACATGCTGGA 60 GTTCAGGAC 69 52 base pairs nucleic acid singlelinear unknown 123 CTACATGTTC CTGAACGTGT TCGAGTACGT CAGCATCTGGTCGCTCTTCA AG 52 68 base pairs nucleic acid single linear unknown 124TACCAGAGCC TGCTGGTGTC CAGCGGCGCC AACCTCTACG CCAGCGGCTC TGGTCCCCAA 60CAAACTCA 68 51 base pairs nucleic acid single linear unknown 125GAGCTTCACC AGCCAGGACT GGCCATTCCT GTATTCGTTG TTCCAAGTCA A 51 57 basepairs nucleic acid single linear unknown 126 CTCCAACTAC GTCCTCAACGGCTTCTCTGG TGCTCGCCTC TCCAACACCT TCCCCAA 57 78 base pairs nucleic acidsingle linear unknown 127 CATTGTTGGC CTCCCCGGCT CCACCACAAC TCATGCTCTGCTTGCTGCCA GAGTGAACTA 60 CTCCGGCGGC ATCTCGAG 78 23 base pairs nucleicacid single linear unknown 128 CCACTGCAGC CACGCTGAGG ACC 23 20 basepairs nucleic acid single linear unknown 129 GGTCTCGAGA TGCCGCCGGA 20 76base pairs nucleic acid single linear unknown 130 ATTGGTGCAT CGCCGTTCAACCAGAACTTC AACTGCTCCA CCTTCCTGCC GCCGCTGCTC 60 ACCCCGTTCG TGAGGT 76 59base pairs nucleic acid single linear unknown 131 CCTGGCTCGA CAGCGGCTCCGACCGCGAGG GCGTGGCCAC CGTCACCAAC TGGCAAACC 59 54 base pairs nucleic acidsingle linear unknown 132 GAGTCCTTCG AGACCACCCT TGGCCTCCGG AGCGGCGCCTTCACGGCGCG TGGG 54 44 base pairs nucleic acid single linear unknown 133AATTCTAACT ACTTCCCCGA CTACTTCATC AGGAACATCT CTGG 44 63 base pairsnucleic acid single linear unknown 134 TGTTCCTCTC GTCGTCCGCA ACGAGGACCTCCGCCGTCCA CTGCACTACA 
ACGAGATCAG 60 GAA 63 61 base pairs nucleic acidsingle linear unknown 135 CATCGCCTCT CCGTCCGGGA CGCCCGGAGG TGCAAGGGCGTACATGGTGA GCGTCCATAA 60 C 61 44 base pairs nucleic acid single linearunknown 136 AGGAAGAACA ACATCCACGC TGTGCATGAG AACGGCTCCA TGAT 44 31 basepairs nucleic acid single linear unknown 137 CCACTCGAGC GGCGACATTGGTGCATCGCC G 31 33 base pairs nucleic acid single linear unknown 138GGTGGTACCT GATCATGGAG CCGTTCTCAT GCA 33 64 base pairs nucleic acidsingle linear unknown 139 GGATCCACCT GGCGCCCAAT GATTACACCG GCTTCACCATCTCTCCAATC CACGCCACCC 60 AAGT 64 62 base pairs nucleic acid singlelinear unknown 140 GAACAACCAG ACACGCACCT TCATCTCCGA GAAGTTCGGCAACCAGGGCG ACTCCCTGAG 60 GT 62 81 base pairs nucleic acid single linearunknown 141 TCGAGCAGAA CAACACCACC GCCAGGTACA CCCTGCGCGG CAACGGCAACAGCTACAACC 60 TGTACCTGCG CGTCAGCTCC A 81 78 base pairs nucleic acidsingle linear unknown 142 TTGGCAACTC CACCATCAGG GTCACCATCA ACGGGAGGGTGTACACAGCC ACCAATGTGA 60 ACACGACGAC CAACAATG 78 41 base pairs nucleicacid single linear unknown 143 ATGGCGTCAA CGACAACGGC GCCCGCTTCAGCGACATCAA C 41 47 base pairs nucleic acid single linear unknown 144ATTGGCAACG TGGTGGCCAG CAGCAACTCC GACGTCCCGC TGGACAT 47 42 base pairsnucleic acid single linear unknown 145 CAACGTGACC CTGAACTCTG GCACCCAGTTCGACCTCATG AA 42 61 base pairs nucleic acid single linear unknown 146CATCATGCTG GTGCCAACTA ACATCTCGCC GCTGTACTGA TAGGAGCTCT GATCAGGTAC 60 C61 21 base pairs nucleic acid single linear unknown 147 GGAGGATCCACCTGGCGCCC A 21 20 base pairs nucleic acid single linear unknown 148GGTGGTACCT GATCAGAGCT 20 20 base pairs nucleic acid single linearunknown 149 CCACCATGGA CAACTCCGTC 20 34 base pairs nucleic acid singlelinear unknown 150 GGAAGAAGAA CAACCACAGC CTGTACCTGG ACCC 34 20 basepairs nucleic acid single linear unknown 151 CCACCAACCT CATGCAAGAC 20 20base pairs nucleic acid single linear unknown 152 CTCAACCAGC GCCTCAACAC20 37 base pairs nucleic acid single linear unknown 153 CCGCAATGCGGTGCCTCTGT CCATCACTTC TTCCGTG 37 19 base pairs 
nucleic acid singlelinear unknown 154 CGTGACGTGA TCCTCAACG 19 19 base pairs nucleic acidsingle linear unknown 155 GGACTGGCCA TTCCTGTAT 19 19 base pairs nucleicacid single linear unknown 156 CGCCAGCGGC TCTGGTCCC 19 19 base pairsnucleic acid single linear unknown 157 GAAGAACTAC ACCAGGGAC 19 20 basepairs nucleic acid single linear unknown 158 GCTCCGACCG CGAGGGCGTG 20 35base pairs nucleic acid single linear unknown 159 CTCCGGAGCG GCGCCTTCACGGCGCGTGGG AATTC 35 20 base pairs nucleic acid single linear unknown 160CATCTCTGGT GTTCCTCTCG 20 20 base pairs nucleic acid single linearunknown 161 GCGGCAACGG CAACAGCTAC 20 22 base pairs nucleic acid singlelinear unknown 162 CTCCACCATC AGGGTCACCA TC 22 18 base pairs nucleicacid single linear unknown 163 GAACATCATG CTGGTGCC 18 3471 base pairsnucleic acid single linear unknown 164 ATGGATAACA ATCCGAACAT CAATGAATGCATTCCTTATA ATTGTTTAAG TAACCCTGAA 60 GTAGAAGTAT TAGGTGGAGA AAGAATAGAAACTGGTTACA CCCCAATCGA TATTTCCTTG 120 TCGCTAACGC AATTTCTTTT GAGTGAATTTGTTCCCGGTG CTGGATTTGT GTTAGGACTA 180 GTTGATATAA TATGGGGAAT TTTTGGTCCCTCTCAATGGG ACGCATTTCT TGTACAAATT 240 GAACAGTTAA TTAACCAAAG AATAGAAGAATTCGCTAGGA ACCAAGCCAT TTCTAGATTA 300 GAAGGACTAA GCAATCTTTA TCAAATTTACGCAGAATCTT TTAGAGAGTG GGAAGCAGAT 360 CCTACTAATC CAGCATTAAG AGAAGAGATGCGTATTCAAT TCAATGACAT GAACAGTGCC 420 CTTACAACCG CTATTCCTCT TTTTGCAGTTCAAAATTATC AAGTTCCTCT TTTATCAGTA 480 TATGTTCAAG CTGCAAATTT ACATTTATCAGTTTTGAGAG ATGTTTCAGT GTTTGGACAA 540 AGGTGGGGAT TTGATGCCGC GACTATCAATAGTCGTTATA ATGATTTAAC TAGGCTTATT 600 GGCAACTATA CAGATCATGC TGTACGCTGGTACAATACGG GATTAGAGCG TGTATGGGGA 660 CCGGATTCTA GAGATTGGAT AAGATATAATCAATTTAGAA GAGAATTAAC ACTAACTGTA 720 TTAGATATCG TTTCTCTATT TCCGAACTATGATAGTAGAA CGTATCCAAT TCGAACAGTT 780 TCCCAATTAA CAAGAGAAAT TTATACAAACCCAGTATTAG AAAATTTTGA TGGTAGTTTT 840 CGAGGCTCGG CTCAGGGCAT AGAAGGAAGTATTAGGAGTC CACATTTGAT GGATATACTT 900 AATAGTATAA CCATCTATAC GGATGCTCATAGAGGAGAAT ATTATTGGTC AGGGCATCAA 960 ATAATGGCTT CTCCTGTAGG GTTTTCGGGGCCAGAATTCA CTTTTCCGCT ATATGGAACT 1020 
ATGGGAAATG CAGCTCCACA ACAACGTATTGTTGCTCAAC TAGGTCAGGG CGTGTATAGA 1080 ACATTATCGT CCACCTTATA TAGAAGACCTTTTAATATAG GGATAAATAA TCAACAACTA 1140 TCTGTTCTTG ACGGGACAGA ATTTGCTTATGGAACCTCCT CAAATTTGCC ATCCGCTGTA 1200 TACAGAAAAA GCGGAACGGT AGATTCGCTGGATGAAATAC CGCCACAGAA TAACAACGTG 1260 CCACCTAGGC AAGGATTTAG TCATCGATTAAGCCATGTTT CAATGTTTCG TTCAGGCTTT 1320 AGTAATAGTA GTGTAAGTAT AATAAGAGCTCCTATGTTCT CTTGGATACA TCGTAGTGCT 1380 GAATTTAATA ATATAATTCC TTCATCACAAATTACACAAA TACCTTTAAC AAAATCTACT 1440 AATCTTGGCT CTGGAACTTC TGTCGTTAAAGGACCAGGAT TTACAGGAGG AGATATTCTT 1500 CGAAGAACTT CACCTGGCCA GATTTCAACCTTAAGAGTAA ATATTACTGC ACCATTATCA 1560 CAAAGATATC GGGTAAGAAT TCGCTACGCTTCTACCACAA ATTTACAATT CCATACATCA 1620 ATTGACGGAA GACCTATTAA TCAGGGGAATTTTTCAGCAA CTATGAGTAG TGGGAGTAAT 1680 TTACAGTCCG GAAGCTTTAG GACTGTAGGTTTTACTACTC CGTTTAACTT TTCAAATGGA 1740 TCAAGTGTAT TTACGTTAAG TGCTCATGTCTTCAATTCAG GCAATGAAGT TTATATAGAT 1800 CGAATTGAAT TTGTTCCGGC AGAAGTAACCTTTGAGGCAG AATATGATTT AGAAAGAGCA 1860 CAAAAGGCGG TGAATGAGCT GTTTACTTCTTCCAATCAAA TCGGGTTAAA AACAGATGTG 1920 ACGGATTATC ATATTGATCA AGTATCCAATTTAGTTGAGT GTTTATCTGA TGAATTTTGT 1980 CTGGATGAAA AAAAAGAATT GTCCGAGAAAGTCAAACATG CGAAGCGACT TAGTGATGAG 2040 CGGAATTTAC TTCAAGATCC AAACTTTAGAGGGATCAATA GACAACTAGA CCGTGGCTGG 2100 AGAGGAAGTA CGGATATTAC CATCCAAGGAGGCGATGACG TATTCAAAGA GAATTACGTT 2160 ACGCTATTGG GTACCTTTGA TGAGTGCTATCCAACGTATT TATATCAAAA AATAGATGAG 2220 TCGAAATTAA AAGCCTATAC CCGTTACCAATTAAGAGGGT ATATCGAAGA TAGTCAAGAC 2280 TTAGAAATCT ATTTAATTCG CTACAATGCCAAACACGAAA CAGTAAATGT GCCAGGTACG 2340 GGTTCCTTAT GGCCGCTTTC AGCCCCAAGTCCAATCGGAA AATGTGCCCA TCATTCCCAT 2400 CATTTCTCCT TGGACATTGA TGTTGGATGTACAGACTTAA ATGAGGACTT AGGTGTATGG 2460 GTGATATTCA AGATTAAGAC GCAAGATGGCCATGAAAGAC TAGGAAATCT AGAATTTCTC 2520 GAAGGAAGAG CACCATTAGT AGGAGAAGCACTAGCTCGTG TGAAAAGAGC GGAGAAAAAA 2580 TGGAGAGACA AACGTGAAAA ATTGGAATGGGAAACAAATA TTGTTTATAA AGAGGCAAAA 2640 GAATCTGTAG ATGCTTTATT TGTAAACTCTCAATATGATA GATTACAAGC GGATACCAAC 2700 ATCGCGATGA TTCATGCGGC 
AGATAAACGCGTTCATAGCA TTCGAGAAGC TTATCTGCCT 2760 GAGCTGTCTG TGATTCCGGG TGTCAATGCGGCTATTTTTG AAGAATTAGA AGGGCGTATT 2820 TTCACTGCAT TCTCCCTATA TGATGCGAGAAATGTCATTA AAAATGGTGA TTTTAATAAT 2880 GGCTTATCCT GCTGGAACGT GAAAGGGCATGTAGATGTAG AAGAACAAAA CAACCACCGT 2940 TCGGTCCTTG TTGTTCCGGA ATGGGAAGCAGAAGTGTCAC AAGAAGTTCG TGTCTGTCCG 3000 GGTCGTGGCT ATATCCTTCG TGTCACAGCGTACAAGGAGG GATATGGAGA AGGTTGCGTA 3060 ACCATTCATG AGATCGAGAA CAATACAGACGAACTGAAGT TTAGCAACTG TGTAGAAGAG 3120 GAAGTATATC CAAACAACAC GGTAACGTGTAATGATTATA CTGCGACTCA AGAAGAATAT 3180 GAGGGTACGT ACACTTCTCG TAATCGAGGATATGACGGAG CCTATGAAAG CAATTCTTCT 3240 GTACCAGCTG ATTATGCATC AGCCTATGAAGAAAAAGCAT ATACAGATGG ACGAAGAGAC 3300 AATCCTTGTG AATCTAACAG AGGATATGGGGATTACACAC CACTACCAGC TGGCTATGTG 3360 ACAAAAGAAT TAGAGTACTT CCCAGAAACCGATAAGGTAT GGATTGAGAT CGGAGAAACG 3420 GAAGGAACAT TCATCGTGGA CAGCGTGGAATTACTTCTTA TGGAGGAATA A 3471

What is claimed is:
 1. A nucleic acid comprising nucleotides 669-1348 of SEQ ID NO: 1.
 2. A monocotyledonous plant containing the nucleic acid of claim 1.
 3. The monocotyledonous plant of claim 2 wherein the plant is maize.
 4. The monocotyledonous plant of claim 3 wherein the nucleic acid is operably linked to a promoter selected from the group consisting of tissue specific promoters, pith specific promoters, constitutive promoters, inducible promoters, and meristematic tissue specific promoters.